From casey.obrien.r at gmail.com Tue Dec 13 23:17:05 2011
From: casey.obrien.r at gmail.com (Casey Ransberger)
Date: Tue Dec 13 23:16:19 2011
Subject: [Ometa] Debugging PEGs and Packrats
Message-ID: <00D6B793-E724-4C5C-B217-010193DEB6C8@gmail.com>

I know this has come up before. Hopefully I'm not about to repeat a lot.

Debugging this stuff just seems really hard. And significantly harder than what I've experienced working with e.g. Yacc.

Hypothesis: Yacc had a lot of time to bake before I ever found it. PEGs are new, so there's been less overall experience with debugging them.

I've experimented in what little time I can devote with OMeta, PetitParser, and Treetop. The debugging experience has been roughly consistent across all three.

One particular issue which has bugged me: memoization seems to carry a lot of instance-state that's really hard to comprehend when the grammar isn't working as I expect. It's just really hard to use that ocean of information to figure out what I've done wrong.

Given that with these new parsing technologies, we're pretty lucky to see "parse error" as an error message, I can't help but think that it's worth studying debugging strategies. Heh. :D I'm really not complaining, I'm just pointing it out.

Has anyone here found any technique(s) which makes debugging a grammar written for a PEG/packrat less of a pain in the butt?

I'd be really interested in hearing about it.

From rafkind at cs.utah.edu Tue Dec 13 23:46:21 2011
From: rafkind at cs.utah.edu (Jon Rafkind)
Date: Tue Dec 13 23:45:33 2011
Subject: [Ometa] Debugging PEGs and Packrats
In-Reply-To: <00D6B793-E724-4C5C-B217-010193DEB6C8@gmail.com>
References: <00D6B793-E724-4C5C-B217-010193DEB6C8@gmail.com>
Message-ID: <4EE8544D.4030903@cs.utah.edu>

On 12/14/2011 12:17 AM, Casey Ransberger wrote:
> I know this has come up before. Hopefully I'm not about to repeat a lot.
>
> Debugging this stuff just seems really hard. And significantly harder than what I've experienced working with e.g. Yacc.
>
> Hypothesis: Yacc had a lot of time to bake before I ever found it. PEGs are new, so there's been less overall experience with debugging them.
>
> I've experimented in what little time I can devote with OMeta, PetitParser, and Treetop. The debugging experience has been roughly consistent across all three.
>
> One particular issue which has bugged me: memoization seems to carry a lot of instance-state that's really hard to comprehend when the grammar isn't working as I expect. It's just really hard to use that ocean of information to figure out what I've done wrong.
>
> Given that with these new parsing technologies, we're pretty lucky to see "parse error" as an error message, I can't help but think that it's worth studying debugging strategies. Heh. :D I'm really not complaining, I'm just pointing it out.
>
> Has anyone here found any technique(s) which makes debugging a grammar written for a PEG/packrat less of a pain in the butt?
>
> I'd be really interested in hearing about it.

I have my PEG generator print out a trace of all decisions it made and what state the parser is in for each character. So something like

at 0 'x' in the start rule
first production in start rule
new rule parse_x
at 0 'x' in the parse_x rule
succeeded with parse_x at 0
new rule parse_y
at 1 'y' in the parse_y rule
...

It can be a lot of information to process especially with large grammars and large inputs but its usually not too hard to see where the peg made an unexpected decision.

Also your question hints at how to make PEG's give reasonable error messages instead of just 'parse error'. There was some work on producing good error messages with PEG's, although I can't remember the authors right now. My current solution is to assume the rule that got the farthest (in terms of input read) is the most likely candidate to report an error.
Actually I produce the entire ancestry of a rule so it looks like

start -> statement -> variable -> identifier

But in all honesty the lineage hasn't been super helpful.

From justin.m.chase at gmail.com Wed Dec 14 09:54:46 2011
From: justin.m.chase at gmail.com (Justin Chase)
Date: Wed Dec 14 09:53:55 2011
Subject: [Ometa] Debugging PEGs and Packrats
In-Reply-To: <4EE8544D.4030903@cs.utah.edu>
References: <00D6B793-E724-4C5C-B217-010193DEB6C8@gmail.com> <4EE8544D.4030903@cs.utah.edu>
Message-ID:

I've been working on an ometa variant and in my parser I added two new
types of semantics "error until" and "error unless" which when used in your
grammar can really help narrow down places where errors occur. Also I came
to the conclusion that a parse attempt should always consume the entire
input so I add an implicit "error until !any" at the end of all grammars.

The logic of the rules (in psuedo-code) is like this:

def ErrorUnless(context, stream, pattern):

    var match = pattern(context, stream);
    if (match is Fail):
        match = null
        context.LogError(stream.Position, ...);

    return match;


def ErrorUntil(context, stream, pattern):

    var start = stream.Position;
    var end = start;
    try:
        while (!stream.Position.IsEnd):
            var match = pattern(context, stream);
            if (!(match is Fail))
                return match;

            if (!stream.MoveNext())
                break;

            end = stream.Position;

        return null;

    finally:
        if (start != end):
            context.LogError(start.Next(), end, ...);


And can be used in rules like this:

Statement = StatementExpression error unless ";"
GroupExpression = "(" Expression error until ")"


Error unless has the effect of skipping what seems to be missing tokens
while error until has the effect of consuming unexpected tokens until
reaching a known going point again. It doesn't help you with evaluating the
memos while debugging but it's a pretty high level way to figure out where
what you're expecting is different from what is actually happening. It also
lets you put error semantics right in the grammar, which from what I
understand is why some professional language developers end up hand-writing
their parsers rather than using a proper grammar tool. So that's my attempt
to help with debugging. You can also call context.Error(...) in productions
to make your own error cases. Errors only propagate if their branch of the
grammar ends up matching.


On Wed, Dec 14, 2011 at 1:46 AM, Jon Rafkind <rafkind@cs.utah.edu> wrote:

> On 12/14/2011 12:17 AM, Casey Ransberger wrote:
> > I know this has come up before. Hopefully I'm not about to repeat a lot.
> >
> > Debugging this stuff just seems really hard. And significantly harder
> than what I've experienced working with e.g. Yacc.
> >
> > Hypothesis: Yacc had a lot of time to bake before I ever found it. PEGs
> are new, so there's been less overall experience with debugging them.
> >
> > I've experimented in what little time I can devote with OMeta,
> PetitParser, and Treetop. The debugging experience has been roughly
> consistent across all three.
> >
> > One particular issue which has bugged me: memoization seems to carry a
> lot of instance-state that's really hard to comprehend when the grammar
> isn't working as I expect. It's just really hard to use that ocean of
> information to figure out what I've done wrong.
> >
> > Given that with these new parsing technologies, we're pretty lucky to
> see "parse error" as an error message, I can't help but think that it's
> worth studying debugging strategies. Heh. :D I'm really not complaining,
> I'm just pointing it out.
> >
> > Has anyone here found any technique(s) which makes debugging a grammar
> written for a PEG/packrat less of a pain in the butt?
> >
> > I'd be really interested in hearing about it.
>
> I have my PEG generator print out a trace of all decisions it made and
> what state the parser is in for each character. So something like
>
> at 0 'x' in the start rule
> first production in start rule
> new rule parse_x
> at 0 'x' in the parse_x rule
> succeeded with parse_x at 0
> new rule parse_y
> at 1 'y' in the parse_y rule
> ...
>
> It can be a lot of information to process especially with large grammars
> and large inputs but its usually not too hard to see where the peg made an
> unexpected decision.
>
> Also your question hints at how to make PEG's give reasonable error
> messages instead of just 'parse error'. There was some work on producing
> good error messages with PEG's, although I can't remember the authors right
> now. My current solution is to assume the rule that got the farthest (in
> terms of input read) is the most likely candidate to report an error.
> Actually I produce the entire ancestry of a rule so it looks like
>
> start -> statement -> variable -> identifier
>
> But in all honesty the lineage hasn't been super helpful.
>
> _______________________________________________
> OMeta mailing list
> OMeta@vpri.org
> http://vpri.org/mailman/listinfo/ometa
>


-- 
Justin Chase
http://www.justinmchase.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://vpri.org/pipermail/ometa/attachments/20111214/3005600e/attachment.htm

From cunningham.cb at gmail.com Wed Dec 14 10:15:19 2011
From: cunningham.cb at gmail.com (Chris Cunningham)
Date: Wed Dec 14 10:14:28 2011
Subject: [Ometa] Debugging PEGs and Packrats
In-Reply-To: <00D6B793-E724-4C5C-B217-010193DEB6C8@gmail.com>
References: <00D6B793-E724-4C5C-B217-010193DEB6C8@gmail.com>
Message-ID:

On Tue, Dec 13, 2011 at 11:17 PM, Casey Ransberger wrote:
> Debugging this stuff just seems really hard. And significantly harder than what I've experienced working with e.g. Yacc.
>
> Has anyone here found any technique(s) which makes debugging a grammar written for a PEG/packrat less of a pain in the butt?
>
> I'd be really interested in hearing about it.
>
A while back I was working on a PetitParser-hosted grammar, and the path that I found helped most with debugging was to write lots of unit tests.
I would find as many 'leaf' areas of the grammar as I could and write tests for those areas, and work out from there into the more general areas - testing each bit as I went. I could usually track down the bugs I had encoded into the parser that way - and ended up with a nice beginning of a full test suite as well.

-Chris

From justin.m.chase at gmail.com Wed Dec 14 10:29:37 2011
From: justin.m.chase at gmail.com (Justin Chase)
Date: Wed Dec 14 10:28:48 2011
Subject: [Ometa] Debugging PEGs and Packrats
In-Reply-To:
References: <00D6B793-E724-4C5C-B217-010193DEB6C8@gmail.com>
Message-ID:

+1 to that, brother. Unit test the heck out of every single little feature
of your language. Even ones that you haven't done yet and watch the green
lights light up.


On Wed, Dec 14, 2011 at 12:15 PM, Chris Cunningham
<cunningham.cb@gmail.com>wrote:

> On Tue, Dec 13, 2011 at 11:17 PM, Casey Ransberger
> <casey.obrien.r@gmail.com> wrote:
> > Debugging this stuff just seems really hard. And significantly harder
> than what I've experienced working with e.g. Yacc.
> >
> > Has anyone here found any technique(s) which makes debugging a grammar
> written for a PEG/packrat less of a pain in the butt?
> >
> > I'd be really interested in hearing about it.
> >
> A while back I was working on a PetitParser-hosted grammar, and the
> path that I found helped most with debugging was to write lots of unit
> tests. I would find as many 'leaf' areas of the grammar as  I could
> and write tests for those areas, and work out from there into the more
> general areas - testing each bit as I went.  I could usually track
> down the bugs I had encoded into the parser that way - and ended up
> with a nice beginning of a full test suite as well.
>
> -Chris
>
> _______________________________________________
> OMeta mailing list
> OMeta@vpri.org
> http://vpri.org/mailman/listinfo/ometa
>


-- 
Justin Chase
http://www.justinmchase.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://vpri.org/pipermail/ometa/attachments/20111214/c900d0b1/attachment.htm

From jewel at subvert-the-dominant-paradigm.net Wed Dec 28 12:20:22 2011
From: jewel at subvert-the-dominant-paradigm.net (John Leuner)
Date: Wed Dec 28 12:11:35 2011
Subject: [Ometa] Re: [fonc] Debugging PEGs and Packrats
In-Reply-To: <00D6B793-E724-4C5C-B217-010193DEB6C8@gmail.com>
References: <00D6B793-E724-4C5C-B217-010193DEB6C8@gmail.com>
Message-ID: <1325103622.4249.54.camel@cmalu.WAG54GS>

Hi Casey

In my OMeta implementations I have found that simply recording the position of the deepest error and then printing out the remaining input text was sufficient to debug my grammars.

I think the next step I would take (if that wasn't sufficient) is to create a test suite that tests each grammar rule independently, successively building up to the complex input that is failing.

John

On Tue, 2011-12-13 at 23:17 -0800, Casey Ransberger wrote:
> I know this has come up before. Hopefully I'm not about to repeat a lot.
>
> Debugging this stuff just seems really hard. And significantly harder than what I've experienced working with e.g. Yacc.
>
> Hypothesis: Yacc had a lot of time to bake before I ever found it. PEGs are new, so there's been less overall experience with debugging them.
>
> I've experimented in what little time I can devote with OMeta, PetitParser, and Treetop. The debugging experience has been roughly consistent across all three.
>
> One particular issue which has bugged me: memoization seems to carry a lot of instance-state that's really hard to comprehend when the grammar isn't working as I expect. It's just really hard to use that ocean of information to figure out what I've done wrong.
>
> Given that with these new parsing technologies, we're pretty lucky to see "parse error" as an error message, I can't help but think that it's worth studying debugging strategies. Heh. :D I'm really not complaining, I'm just pointing it out.
>
> Has anyone here found any technique(s) which makes debugging a grammar written for a PEG/packrat less of a pain in the butt?
>
> I'd be really interested in hearing about it.
>
>
>
> _______________________________________________
> fonc mailing list
> fonc@vpri.org
> http://vpri.org/mailman/listinfo/fonc

From anand.prabhakar.patil at gmail.com Fri Dec 30 04:49:23 2011
From: anand.prabhakar.patil at gmail.com (Anand Patil)
Date: Fri Dec 30 04:48:47 2011
Subject: [Ometa] Stateful parsing and memoization
Message-ID:

Hi all,

I was just wondering whether it's possible to switch off memoization for individual rules. A schematic of the problem follows:

rule1 = expr:e ?mutating_function_1(e, self.state) -> 'r1'
rule2 = expr:e ?mutating_function_2(e, self.state) -> 'r2'
top = rule1 | rule2 | rule1

If I'm understanding correctly, the third rule1 in top isn't doing anything, because the failure of the first rule1 has been memoized. But the attempt to match rule2 has changed self.state, so the third rule1 might actually match even if the first one doesn't. Can I make it so rule1 actually does get tried a second time?

With best wishes,
Anand
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://vpri.org/pipermail/ometa/attachments/20111230/87a308c5/attachment.htm

From anand.prabhakar.patil at gmail.com Fri Dec 30 05:06:59 2011
From: anand.prabhakar.patil at gmail.com (Anand Patil)
Date: Fri Dec 30 05:06:22 2011
Subject: [Ometa] Re: Stateful parsing and memoization
In-Reply-To:
References:
Message-ID:

One way to get this to work seems to be to pass the state around as a parameter:

rule1 :state = expr:e ?mutating_function_1(e, state) -> 'r1'
rule2 :state = expr:e ?mutating_function_2(e, state) -> 'r2'
top_ :state = rule1(state) | rule2(state)
top = top_([])

It looks like all the parameters get considered when deciding whether a rule matches, which makes sense. It's nice that I can now eliminate the third rule1. Is that the best way to do this?

Thanks,
Anand

On Fri, Dec 30, 2011 at 12:49 PM, Anand Patil <
anand.prabhakar.patil@gmail.com> wrote:

> Hi all,
>
> I was just wondering whether it's possible to switch off memoization for
> individual rules. A schematic of the problem follows:
>
> rule1 = expr:e ?mutating_function_1(e, self.state) -> 'r1'
> rule2 = expr:e ?mutating_function_2(e, self.state) -> 'r2'
> top = rule1 | rule2 | rule1
>
> If I'm understanding correctly, the third rule1 in top isn't doing
> anything, because the failure of the first rule1 has been memoized. But the
> attempt to match rule2 has changed self.state, so the third rule1 might
> actually match even if the first one doesn't. Can I make it so rule1
> actually does get tried a second time?
>
> With best wishes,
> Anand
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://vpri.org/pipermail/ometa/attachments/20111230/557586a5/attachment.htm

From justin.m.chase at gmail.com Fri Dec 30 15:18:00 2011
From: justin.m.chase at gmail.com (Justin Chase)
Date: Fri Dec 30 15:17:05 2011
Subject: [Ometa] Re: Stateful parsing and memoization
In-Reply-To:
References:
Message-ID:

It might be said that mutating the state during parsing is not a good idea
if it can at all be avoided. Can you elaborate on why you're doing that? I
believe that mutation introduces far more complexity than just this one
problem and may make the entire prospect of parsing non-deterministic.

On Fri, Dec 30, 2011 at 7:06 AM, Anand Patil <
anand.prabhakar.patil@gmail.com> wrote:

> One way to get this to work seems to be to pass the state around as a
> parameter:
>
> rule1 :state = expr:e ?mutating_function_1(e, state) -> 'r1'
> rule2 :state = expr:e ?mutating_function_2(e, state) -> 'r2'
> top_ :state = rule1(state) | rule2(state)
> top = top_([])
>
> It looks like all the parameters get considered when deciding whether a
> rule matches, which makes sense. It's nice that I can now eliminate the
> third rule1. Is that the best way to do this?
>
> Thanks,
> Anand
>
> On Fri, Dec 30, 2011 at 12:49 PM, Anand Patil <
> anand.prabhakar.patil@gmail.com> wrote:
>
>> Hi all,
>>
>> I was just wondering whether it's possible to switch off memoization for
>> individual rules. A schematic of the problem follows:
>>
>> rule1 = expr:e ?mutating_function_1(e, self.state) -> 'r1'
>> rule2 = expr:e ?mutating_function_2(e, self.state) -> 'r2'
>> top = rule1 | rule2 | rule1
>>
>> If I'm understanding correctly, the third rule1 in top isn't doing
>> anything, because the failure of the first rule1 has been memoized. But the
>> attempt to match rule2 has changed self.state, so the third rule1 might
>> actually match even if the first one doesn't. Can I make it so rule1
>> actually does get tried a second time?
>>
>> With best wishes,
>> Anand
>>
>
>
> _______________________________________________
> OMeta mailing list
> OMeta@vpri.org
> http://vpri.org/mailman/listinfo/ometa
>
>


-- 
Justin Chase
http://www.justinmchase.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://vpri.org/pipermail/ometa/attachments/20111230/adc2d0bd/attachment.htm
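
Editorial sketch on the "Stateful parsing and memoization" thread: the effect Anand describes can be reproduced with a toy packrat-style memo table. This is only an illustration under stated assumptions. ToyParser, char_in, and the key_on_args flag are hypothetical names invented here, not part of OMeta or any of its ports, and nothing in the thread confirms how a real implementation builds its memo keys. The sketch shows that a memo keyed on (rule, position) alone replays an earlier failure after the state changes, while a key that also includes the rule's arguments lets the rule run again.

```python
# Toy illustration only: a hand-rolled memo table, not OMeta's real internals.
FAIL = object()  # sentinel for "rule did not match"


class ToyParser:
    def __init__(self, text, key_on_args=True):
        self.text = text
        self.key_on_args = key_on_args  # hypothetical switch for the comparison
        self.memo = {}   # memo key -> cached rule result
        self.calls = 0   # number of times a rule body actually ran

    def apply(self, rule, pos, *args):
        # Packrat-style lookup: reuse the cached result when the key was seen before.
        key = (rule.__name__, pos, args if self.key_on_args else ())
        if key not in self.memo:
            self.calls += 1
            self.memo[key] = rule(self, pos, *args)
        return self.memo[key]


def char_in(parser, pos, allowed):
    """Match one character at pos if it belongs to the `allowed` state."""
    if pos < len(parser.text) and parser.text[pos] in allowed:
        return (parser.text[pos], pos + 1)
    return FAIL


# State passed as a (hashable) argument takes part in the memo key.
p = ToyParser("7")
first = p.apply(char_in, 0, frozenset("123"))    # fails: "7" is not allowed yet
second = p.apply(char_in, 0, frozenset("789"))   # new state, so the rule runs again

# Keyed on (rule, position) only, the earlier failure is replayed.
stale = ToyParser("7", key_on_args=False)
stale_first = stale.apply(char_in, 0, frozenset("123"))
stale_second = stale.apply(char_in, 0, frozenset("789"))

print(first is FAIL, second, stale_second is FAIL)
```

The state is passed as a frozenset so that it can be hashed into the key. A mutable list, as in top_([]) from the thread, would need a different keying strategy, and that part is left open here.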