Skip to content

Allow reporting locations of semantic errors #57

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
astralarya opened this issue Oct 22, 2014 · 6 comments
Closed

Allow reporting locations of semantic errors #57

astralarya opened this issue Oct 22, 2014 · 6 comments

Comments

@astralarya
Copy link
Contributor

It would be nice if there were a mechanism for reporting the locations of semantic errors (e.g. incorrect value type). It looks like the functionality is almost there with Value::getOffsetStart(). Perhaps something like:

void Reader::pushError(const Value& value, std::string msg);

which would use the Value offset and Reader::begin_ to calculate a Reader::Location and push_back a Reader::ErrorInfo onto Reader::errors_. The error could then be printed with a call to Reader::getFormattedErrorMessages(). Perhaps there could be a:

void Reader::pushError(const Value& value, std::string msg, const Value& extra);

that also initializes Reader::ErrorInfo::extra_.

Of course then it would be nice to have a function

bool Reader::good() const;

to return !Reader::errors_.size().

Soooo, I guess I basically could implement this right now....lol. Expect a pull request shortly.

@cdunn2001
Copy link
Contributor

Are you using this? I want to decouple the Value from the start/limit info used only during parsing, for both clarity and memory efficiency. For now, I want to delete pushError, which is not used here. Could you explain the use case?

@astralarya
Copy link
Contributor Author

I am currently using this functionality, but I don't see why we couldn't decouple the start/limit info if we stored the row/column information in the Values generated while parsing. As for my current usage, here is an example. Warning, the example below could be overwhelming.

We define a Thing struct and a parseThing function that is suitable to be called at any point during JSON parsing that takes a Thing object and returns a boolean success flag.

struct Thing {
  std::string author;
  std::vector<std::string> labels;
};

bool parseThing(
  Json::Reader& reader,
  const Json::Value& root,
  Thing& aThing) {
    bool good = true;
    if(root.isObject()) {
      for(const auto& key : root.getMemberNames()) {
        const auto& member = root[key];
        if(key == "AUTHOR") {
          if(member.isString()) {
            aThing.author = member.asString();
          } else {
            reader.pushError(member,"AUTHOR must be a string");
            good = false;
          }
        } else if(key == "LABELS") {
          if(member.isArray()) {
            for(const auto& value : member) {
              if(value.isString()) {
                aThing.labels.push_back(value.asString());
              } else {
                reader.pushError(value,"LABELS array elements must be strings");
                good = false;
              }
            }
          } else if(member.isString()) {
            aThing.labels.push_back(member.asString());
          } else {
            reader.pushError(member,"LABELS must be an array or string");
            good = false;
          }
        } else {
          reader.pushError(member,"Unknown attribute: " + key);
          good = false;
        }
      }
    } else {
      reader.pushError(root,"Thing must be an object");
      good = false;
    }
    return good;
  }

Example usage:

int JSONparser::parse(std::ifstream& stream) {
  Json::Value root;
  if(!_reader.parse(stream,root)) {
    // report to the user the failure and their locations in the document.
    std::cerr << "Failed to parse file: " << file << '\n'
              << _reader.getFormattedErrorMessages();
    return 1;
  }
  std::vector<Thing> myThings;
  /* this section could be more complex in practice */
  if(root.isObject()) {
    Thing aThing;
    if(parseThing(reader,root,aThing)) {
      myThings.push_back(aThing);
    }
  } else if(root.isArray()) {
    /* blah blah */
  }
  if(!_reader.good()) {
    // Will be true if any Things failed to parse
    std::cerr << _reader.getFormattedErrorMessages();
    return 1;
  }
  /*
  DO STUFF WITH myThings
  */
  return 0;
}

Benefits:

  • Semantic errors while parsing Things report their locations in the input stream and print themselves using the existing print mechanism.
  • Semantic parsing of Things are encapsulated and reusable.

Areas for improvement:

  • Currently, Reader must be passed into parseThing. The Value could potentially be associated directly with the generating Reader, simplifying code; however, this isn't really a big deal.
  • Simple type checks could be made easier, for example, an assertString() function. However, this would not be useful for values that could be a string or array of strings. We could consider assert statements similar to the Chai JS Assertion library (http://chaijs.com/api/).

@cdunn2001
Copy link
Contributor

I don't oppose semantic parsing in principle, but I wonder whether you wouldn't be better off with XML. JSON has no semantic information; it's just a data-interchange format. Even in your example the entire JSON document is read before parseThing() is called. With XML, you could process the semantics during the parsing.

Why can't you wrap a Json::Value with your own class and include pushError() there?

This odd method stands in the way of a more efficient JSON parser. How strongly would you object to its removal in v1.4.0? Or at least in v2.0.0?

@astralarya
Copy link
Contributor Author

I wouldn't object to it's removal, but it would be nice if there were methods to facilitate semantic assertions. The "right" way would probably be to define semantic constraints before parsing. It might even be possible to implement parse-time no-copy initialization of STL and user defined classes.

That being said, c++ is a systems programming language. Whatever needs to be done to improve efficiency is fine by me.

@cdunn2001
Copy link
Contributor

Nevermind. It's a harmless addition to the current Reader. Someday, we might have a new Reader without any error-reporting, but it's fine for now.

@astralarya
Copy link
Contributor Author

What would you think of a SemanticReader class with the ability to define rules before parsing? I'm not yet familiar enough with the Reader code to be sure about the implementation of no-copy initialization, but I have some ideas for allowing user defined semantics if you were interested. I'm thinking something along the lines of

// parser state API object
class Json::ReaderState;
// Trigger function signature
typename std::function<int (Json::ReaderState)> Json::TriggerFunc;
// Add a parsing hook
Json::SemanticReader::registerTrigger(
  Json::ValueType type,
  Json::TriggerFunc func);

registerTrigger would push the TriggerFuncs onto a vector that could be for_each invoked after a Value of type is parsed. There are possibly some more parameters to registerTrigger that would make sense (an object key filter?). Obviously this could severely impact performance depending on the quality and number of the registered functions, but I think it would be acceptable considering the functionality would be separated into SemanticReader.

Example usage: Reading in vectors of doubles

Json::SemanticReader aReader;
std::vector<std::vector<double>> myLists;
aReader.registerTrigger(
  Json::ValueType::arrayValue,
  [&myLists](ReaderState s) {
    if(std::all_of(
      s.value.begin(), s.value.end(),
      [](const Value& v){return v.isDouble();}))
    {
      // if we have a list of doubles
      std::vector<double> aVector;
      // transform values into aVector
      std::transform(s.value.begin(),s.value.end(),
        std::back_inserter(aVector),
        [](const Value& v){return v.asDouble();});
      myLists.push_back(aVector);
      return 0;
    } else {
      // else report semantic error
      s.push_error("Not a list of doubles");
      return 1;
    }
  });

cdunn2001 added a commit to cdunn2001/jsoncpp that referenced this issue Feb 9, 2015
cdunn2001 added a commit to cdunn2001/jsoncpp that referenced this issue Feb 10, 2015
cdunn2001 added a commit to cdunn2001/jsoncpp that referenced this issue Feb 11, 2015
cdunn2001 added a commit to cdunn2001/jsoncpp that referenced this issue Feb 11, 2015
cdunn2001 added a commit that referenced this issue Feb 11, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Feb 15, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Feb 18, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 3, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit to cdunn2001/jsoncpp that referenced this issue Mar 3, 2015
cdunn2001 added a commit that referenced this issue Mar 3, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 5, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 5, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 6, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 7, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 8, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 9, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 12, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 15, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 15, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Mar 31, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit that referenced this issue Apr 11, 2015
for binary-compatibility with 0.6.0
issue #147
was #57
cdunn2001 added a commit to cdunn2001/jsoncpp that referenced this issue Apr 19, 2015
cdunn2001 added a commit to cdunn2001/jsoncpp that referenced this issue Jun 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants