How do I write my own parser? (for JSON)

Friday 05 December 2008 09:00

If no parser is available for the file you need, writing one yourself may be easier than you think. What file structures are manageable? What would be the design of such a parser? How do you make sure it is complete? Here we describe the process of building a JSON parser in C#, and release the source code.

By Patrick van Bergen

[Download the JSON parser / generator for C#]

The software is subject to the MIT license: you are free to use it in any way you like, but it must keep its license.

For our synchronisation-module (which we use to synchronize data between diverse business applications) we chose JSON for data exchange. JSON is just a little better suited for a PHP web-environment than XML, because:

  • The PHP functions json_encode() and json_decode() allow you to convert data structures from and to JSON strings
  • JSON can be sent directly to the browser in an Ajax request
  • It takes up less space than XML, which is important in server > browser traffic.
  • A JSON string can be composed of only ASCII characters, while still being able to express all UNICODE characters, thus avoiding all possible conversion issues a transport may carry.

So JSON is very convenient for PHP. But of course we wanted to be able to synchronize with Windows applications as well, and because C# is better suited to this environment, this part of the module was written in this language. The .Net framework just didn't have its own JSON parser / encoder and the open-source software written for this task often contained a whole package of classes and constraints and sometimes the JSON implementation wasn't even complete.

We just wanted a single class that could be imported and that used the most basic building blocks of our application: the ArrayList and the Hashtable. Also, all aspects of JSON would have to be implemented, there should be a JSON generator, and of course it should be fast.

We didn't need more reasons to write our own parser. Writing a parser happens to be a very satisfying thing to do. It is the best way to learn a new programming language thoroughly, especially if you're using unit testing to guarantee that the parser / generator matches the language specification exactly. JSON's specification is easy to find. The website http://www.json.org/ is as clear as one could wish for.

You start by writing the unit tests. You should really write all tests before starting the implementation, but such patience is seldom found in a programmer. You can at least start by writing some obvious tests that help you to create a consistent API. This is an example of a simple object test:

string json;
Hashtable o;
bool success = true;

json = "{\"name\":123,\"name2\":-456e8}";
o = (Hashtable)JSON.JsonDecode(json);
success = success && ((double)o["name"] == 123);
success = success && ((double)o["name2"] == -456e8);

Eventually you should write all tests needed to check all aspects of the language, because your users (other programmers) will assume that the parser just works.

OK. Parsers. Parsers are associated with specialized software: so-called compiler compilers (of which Yacc is the best known). Using this software will make sure that the parser will be fast, but it does not do all the work for you. What's more, it can be even easier to write the entire parser yourself than to do all the preparatory work for the compiler compiler.

The compiler compiler is needed for languages with a high level of ambiguity. A language expression is parsed from left to right. If a language contains many structures that cannot be identified at the start of the parse, it is advisable to use a tool that is able to manage the emerging complexity.

Unambiguous languages are better suited to building the parser manually, using recursive functions to process the recursive nature of the language. The parser looks ahead one or more tokens to identify the next construct. For JSON it is even sufficient to look ahead a single token. This classifies it as an LL(1) language (see also http://en.wikipedia.org/wiki/LL_parser).

A parser takes as input a string of tokens. Tokens are the most elementary building blocks of a language, like "+", "{", "[", but also complete numbers like "-1.345e5" and strings like "'The Scottish highlander looked around.'". The parse phase is usually preceded by a tokenization phase. In our JSON parser this step is integrated in the parser, because to determine the next token, in almost all cases, it is enough to just read the next character in the string. This saves the allocation of a token table in memory.
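
To illustrate how a single character usually identifies the next token, here is a minimal sketch. The class name TokenSketch, the Token enum and the method Peek are made up for this example and are not taken from the downloadable class; only the literals true, false and null need more than one character.

```csharp
using System;

public static class TokenSketch
{
    // Hypothetical token kinds, mirroring the constructs JSON knows.
    public enum Token
    {
        None, CurlyOpen, CurlyClose, SquaredOpen, SquaredClose,
        Colon, Comma, String, Number, True, False, Null
    }

    // Classifies the token that starts at position index, skipping leading whitespace.
    // The index is passed by value, so the caller's position does not move (a look-ahead).
    public static Token Peek(string json, int index)
    {
        while (index < json.Length && " \t\n\r".IndexOf(json[index]) != -1) {
            index++;
        }
        if (index == json.Length) {
            return Token.None;
        }

        char c = json[index];
        switch (c) {
            case '{': return Token.CurlyOpen;
            case '}': return Token.CurlyClose;
            case '[': return Token.SquaredOpen;
            case ']': return Token.SquaredClose;
            case ':': return Token.Colon;
            case ',': return Token.Comma;
            case '"': return Token.String;
            case '-': return Token.Number;
        }
        if (c >= '0' && c <= '9') {
            return Token.Number;
        }

        // The three literals are the only tokens that need more than one character.
        if (string.CompareOrdinal(json, index, "true", 0, 4) == 0) return Token.True;
        if (string.CompareOrdinal(json, index, "false", 0, 5) == 0) return Token.False;
        if (string.CompareOrdinal(json, index, "null", 0, 4) == 0) return Token.Null;

        return Token.None;
    }
}
```

Because the token kind is computed on demand from the character array, no separate token list has to be built before parsing starts.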

The parser takes a string as input and returns a C# datastructure, consisting of ArrayLists, Hashtables, a number of scalar value types and null. The string is processed from left-to-right. An index (pointer) keeps track of the current position in the string at any moment. At each level of the parse process the parser performs these steps:

  • Look ahead 1 token to determine the type of the next construct
  • Choose the function to parse the construct
  • Call this function and integrate the returned value in the construct that is currently built.
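
The three steps above can be sketched with a deliberately tiny language: numbers and nested, bracketed lists of numbers. This is not the JSON parser itself; the class MiniParser and its methods are hypothetical names chosen for this example. Like the parser described here, it is lenient about commas and simply skips them.

```csharp
using System;
using System.Collections;
using System.Globalization;

public static class MiniParser
{
    // Step 1 and 2: look ahead one character and choose the function for the construct.
    public static object ParseValue(string s, ref int index, ref bool success)
    {
        SkipWhitespace(s, ref index);
        if (index < s.Length) {
            char c = s[index];
            if (c == '[') {
                return ParseArray(s, ref index, ref success);
            }
            if (c == '-' || (c >= '0' && c <= '9')) {
                return ParseNumber(s, ref index, ref success);
            }
        }
        success = false;
        return null;
    }

    // Step 3: call ParseValue recursively and integrate the result in the list being built.
    static ArrayList ParseArray(string s, ref int index, ref bool success)
    {
        ArrayList list = new ArrayList();
        index++; // skip '['
        while (true) {
            SkipWhitespace(s, ref index);
            if (index == s.Length) {
                // end of input before the closing bracket: a syntax error
                success = false;
                return null;
            }
            if (s[index] == ']') {
                index++;
                return list;
            }
            if (s[index] == ',') {
                index++;
                continue;
            }
            object value = ParseValue(s, ref index, ref success);
            if (!success) {
                return null;
            }
            list.Add(value);
        }
    }

    static object ParseNumber(string s, ref int index, ref bool success)
    {
        int start = index;
        while (index < s.Length && "0123456789+-.eE".IndexOf(s[index]) != -1) {
            index++;
        }
        double result;
        string text = s.Substring(start, index - start);
        if (!double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out result)) {
            success = false;
            return null;
        }
        return result;
    }

    static void SkipWhitespace(string s, ref int index)
    {
        while (index < s.Length && " \t\n\r".IndexOf(s[index]) != -1) {
            index++;
        }
    }
}
```

The index is passed by reference so that every function leaves it just past the construct it consumed, which is what lets the caller continue where the callee stopped.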

A nice example is the recursive function "ParseObject" that parses an object:

protected Hashtable ParseObject(char[] json, ref int index)
{
    Hashtable table = new Hashtable();
    int token;

    // {
    NextToken(json, ref index);

    bool done = false;
    while (!done) {
        token = LookAhead(json, index);
        if (token == JSON.TOKEN_NONE) {
            return null;
        } else if (token == JSON.TOKEN_COMMA) {
            NextToken(json, ref index);
        } else if (token == JSON.TOKEN_CURLY_CLOSE) {
            NextToken(json, ref index);
            return table;
        } else {

            // name
            string name = ParseString(json, ref index);
            if (name == null) {
                return null;
            }

            // :
            token = NextToken(json, ref index);
            if (token != JSON.TOKEN_COLON) {
                return null;
            }

            // value
            bool success = true;
            object value = ParseValue(json, ref index, ref success);
            if (!success) {
                return null;
            }

            table[name] = value;
        }
    }

    return table;
}

The function is only called if a look-ahead has determined that a construct starts with an opening curly brace, so this token may be skipped. Next, the string is parsed as long as neither the closing brace nor the end of the string is found (the latter is a syntax error, but one that needs to be caught). Between the braces there are a number of "name": value pairs, separated by commas. This algorithm can be found literally in the function, which makes it very readable and thus easy to debug. The function builds a Hashtable and returns it to the calling function. The parser mainly consists of these types of functions.

If you create your own parser, you will always need to take into account that the incoming string may be grammatically incorrect. Users expect the parser to be able to tell on which line the error occurred. Our parser only remembers the index, but it also contains an extra function that returns the immediate context of the position of the error, comparable to the error messages that MySQL generates.
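
As a rough illustration of error reporting from a bare index, the sketch below derives a line number and a short context snippet from a position in the input. The class ErrorContextSketch and both method names are assumptions made for this example; they are not the function that ships with the parser.

```csharp
using System;

public static class ErrorContextSketch
{
    // Returns the 1-based line number that contains position index.
    public static int LineOf(string text, int index)
    {
        int line = 1;
        for (int i = 0; i < index && i < text.Length; i++) {
            if (text[i] == '\n') {
                line++;
            }
        }
        return line;
    }

    // Returns up to radius characters on either side of index.
    public static string Context(string text, int index, int radius)
    {
        // Clamp the index so positions past the end of the text are safe.
        index = Math.Max(0, Math.Min(index, text.Length));
        int start = Math.Max(0, index - radius);
        int end = Math.Min(text.Length, index + radius);
        return text.Substring(start, end - start);
    }
}
```

A caller that keeps the index at which parsing failed could then build a message such as "syntax error on line N near ..." without the parser having to count lines while it runs.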

If you want to know more about parsers, it is good to know that there is a standard work on this subject, which recently (2006) saw its second edition:

Compilers: Principles, Techniques, and Tools, Aho, A.V., Sethi, R. and Ullman, J.D. (1986)


Reactions on "How do I write my own parser? (for JSON)"

garfix
Placed on: 08-16-2010 12:23
Patrick van Bergen
to be continuum
@BananielTheSpaniel, I looked at your contribution and I agree it is interesting. However, I want to keep the code of the project to a minimum (it's the charm of the implementation) and indentation and sorting are not required by JSON.
Edited
garfix has edited this message on: 08-16-2010 12:23
zamkinos
Placed on: 08-22-2010 10:44
thanks
philipp
Placed on: 10-03-2010 21:05
When using JsonEncode() with a Hashtable or ArrayList which contains one or more not serializable
entries it doesn't return null (as in documentation above method) but a string with incomplete JSON.
Reason for this behaviour is that SerializeValue() doesn't use the return values of all Serialize*.
Maybe this was introduced by refactoring because there is also an unused SerializeObjectOrArray().

Aside from this small flaw the decoder/encoder is really useful. Thanks.
garfix
Placed on: 10-05-2010 09:47
Patrick van Bergen
to be continuum
Thanks Philipp,

I fixed the bug you named. It wasn't caused by a refactoring action; it never worked. I simply hadn't written a unit test for encoding an invalid input.

--

I removed the expando code, because the original submitter did not respond to my request of testing it and I want to keep this class as simple as possible anyway.
veggieCoder
Placed on: 11-03-2010 18:20
Thanks very much for sharing this code! I'm parsing JSON responses from a server that contain nested objects and arrays. How can I get to specific data points that might be several layers deep? Does the parser already contain that functionality, or would I need to write some kind of recursive function myself?
garfix
Placed on: 11-03-2010 18:56
Patrick van Bergen
to be continuum
Hello veggieCoder,

The parser merely returns the nested hashtable / array structure. It has no xpath-like functions itself.
mr_miles
Placed on: 11-19-2010 15:40
Hi,

Thanks for this code - it's exactly the sort of simple lightweight approach I was looking for.

I've modified it so that I can feed in a StreamReader, rather than read an entire string in. Are you interested in a patch to update the code?

Thanks again
garfix
Placed on: 11-20-2010 21:44
Patrick van Bergen
to be continuum
Hello mr_miles,

You're welcome! And indeed I like to keep the code as simple as possible. Can you place the modified code here? Anyone who's interested can then integrate it.
Cleveland Mark Blakemore
Placed on: 11-24-2010 01:12
Rocks the house! A JSON parser in 2kb. That's uber-monstrously useful for converting this format into and out of C# objects. I used this code in the .NET Micro Framework and it eliminated heaps of useless XML calls for the 128 kb NETDuino code space. At one point I was using 34K on the XML parsing code alone. I am building a message intensive decentralized terminal and any brevity in the parsing of these messages is much appreciated.

Note this outperformed both XML and Windows .INI format in both memory requirements and parsing speed.

Thanks for this.
garfix
Placed on: 11-24-2010 08:45
Patrick van Bergen
to be continuum
Thanks, Cleveland Mark,

That's cool stuff you are working on. Keep it up!
Patch to add support for a stream
Placed on: 11-30-2010 12:52
Hi,

Here's a unified diff to allow a stream to be passed to the parser. Looks like a lot of changes, but only really to pass the stream object around - was actually pretty simple.

--- c:/JSON-orig.cs Tue Nov 30 11:34:14 2010
+++ c:/JSON-withStreamSupport.cs Tue Nov 30 11:47:39 2010
@@ -1,96 +1,141 @@
-using System;
+using System;
using System.Data;
using System.Collections;
using System.Globalization;
using System.Text;
+using System.IO;

+///XXX: modified to allow support for parsing a stream
namespace Procurios.Public
{
- /// <summary>
- /// This class encodes and decodes JSON strings.
- /// Spec. details, see http://www.json.org/
- ///
- /// JSON uses Arrays and Objects. These correspond here to the datatypes ArrayList and Hashtable.
- /// All numbers are parsed to doubles.
- /// </summary>
- public class JSON
- {
- public const int TOKEN_NONE = 0;
- public const int TOKEN_CURLY_OPEN = 1;
- public const int TOKEN_CURLY_CLOSE = 2;
- public const int TOKEN_SQUARED_OPEN = 3;
- public const int TOKEN_SQUARED_CLOSE = 4;
- public const int TOKEN_COLON = 5;
- public const int TOKEN_COMMA = 6;
- public const int TOKEN_STRING = 7;
- public const int TOKEN_NUMBER = 8;
- public const int TOKEN_TRUE = 9;
- public const int TOKEN_FALSE = 10;
- public const int TOKEN_NULL = 11;
-
- private const int BUILDER_CAPACITY = 2000;
-
- /// <summary>
- /// Parses the string json into a value
- /// </summary>
- /// <param name="json">A JSON string.</param>
- /// <returns>An ArrayList, a Hashtable, a double, a string, null, true, or false</returns>
- public static object JsonDecode(string json)
- {
- bool success = true;
-
- return JsonDecode(json, ref success);
- }
-
- /// <summary>
- /// Parses the string json into a value; and fills 'success' with the successfullness of the parse.
- /// </summary>
- /// <param name="json">A JSON string.</param>
- /// <param name="success">Successful parse?</param>
- /// <returns>An ArrayList, a Hashtable, a double, a string, null, true, or false</returns>
- public static object JsonDecode(string json, ref bool success)
- {
- success = true;
- if (json != null) {
- char[] charArray = json.ToCharArray();
- int index = 0;
- object value = ParseValue(charArray, ref index, ref success);
- return value;
- } else {
- return null;
- }
- }
+ /// <summary>
+ /// This class encodes and decodes JSON strings.
+ /// Spec. details, see http://www.json.org/
+ ///
+ /// JSON uses Arrays and Objects. These correspond here to the datatypes ArrayList and Hashtable.
+ /// All numbers are parsed to doubles.
+ /// </summary>
+ public class JSON
+ {
+ public const int TOKEN_NONE = 0;
+ public const int TOKEN_CURLY_OPEN = 1;
+ public const int TOKEN_CURLY_CLOSE = 2;
+ public const int TOKEN_SQUARED_OPEN = 3;
+ public const int TOKEN_SQUARED_CLOSE = 4;
+ public const int TOKEN_COLON = 5;
+ public const int TOKEN_COMMA = 6;
+ public const int TOKEN_STRING = 7;
+ public const int TOKEN_NUMBER = 8;
+ public const int TOKEN_TRUE = 9;
+ public const int TOKEN_FALSE = 10;
+ public const int TOKEN_NULL = 11;
+
+ private const int BUILDER_CAPACITY = 2000;
+
+ /// <summary>
+ /// Parses the string json into a value
+ /// </summary>
+ /// <param name="json">A JSON string.</param>
+ /// <returns>An ArrayList, a Hashtable, a double, a string, null, true, or false</returns>
+ public static object JsonDecode(string json)
+ {
+ bool success = true;
+
+ return JsonDecode(json, ref success);
+ }
+
+ public static object JsonDecode(FileInfo fi)
+ {
+ bool success = true;
+ StreamReader reader;
+ using (reader = new StreamReader(fi.FullName))
+ {
+ {
+ return JsonDecode(ref reader, null, ref success);
+ }
+ }
+ }
+
+ public static object JsonDecode(Stream strm)
+ {
+ bool success = true;
+ StreamReader reader;
+ using (reader = new StreamReader(strm))
+ {
+ return JsonDecode(ref reader, null, ref success);
+ }
+ }
+
+ /// <summary>
+ /// Parses the string json into a value; and fills 'success' with the successfullness of the parse.
+ /// </summary>
+ /// <param name="json">A JSON string.</param>
+ /// <param name="success">Successful parse?</param>
+ /// <returns>An ArrayList, a Hashtable, a double, a string, null, true, or false</returns>
+ public static object JsonDecode(string json, ref bool success)
+ {
+ StreamReader reader = null;
+ return JsonDecode(ref reader, json, ref success);
+ }
+
+ public static object JsonDecode(ref StreamReader reader, string json, ref bool success)
+ {
+ success = true;
+ if (reader != null)
+ {
+ if (reader.Peek() > -1)
+ json = reader.ReadLine();
+ }
+
+ if (json != null)
+ {
+ char[] charArray = json.ToCharArray();
+ int index = 0;
+ object value = ParseValue(ref reader, ref charArray, ref index, ref success);
+ return value;
+ }
+ else
+ {
+ return null;
+ }
+ }
+
+ /// <summary>
+ /// Converts a Hashtable / ArrayList object into a JSON string
+ /// </summary>
+ /// <param name="json">A Hashtable / ArrayList</param>
+ /// <returns>A JSON encoded string, or null if object 'json' is not serializable</returns>
+ public static string JsonEncode(object json)
+ {
+ StringBuilder builder = new StringBuilder(BUILDER_CAPACITY);
+ bool success = SerializeValue(json, builder);
+ return (success ? builder.ToString() : null);
+ }
+
+ protected static Hashtable ParseObject(char[] json, ref int index, ref bool success)
+ {
+ StreamReader reader = null;
+ return ParseObject(ref reader, ref json, ref index, ref success);
+ }

- /// <summary>
- /// Converts a Hashtable / ArrayList object into a JSON string
- /// </summary>
- /// <param name="json">A Hashtable / ArrayList</param>
- /// <returns>A JSON encoded string, or null if object 'json' is not serializable</returns>
- public static string JsonEncode(object json)
- {
- StringBuilder builder = new StringBuilder(BUILDER_CAPACITY);
- bool success = SerializeValue(json, builder);
- return (success ? builder.ToString() : null);
- }
-
- protected static Hashtable ParseObject(char[] json, ref int index, ref bool success)
+ protected static Hashtable ParseObject(ref StreamReader reader, ref char[] json, ref int index, ref bool success)
{
Hashtable table = new Hashtable();
int token;

// {
- NextToken(json, ref index);
+ NextToken(ref reader, ref json, ref index);

bool done = false;
while (!done) {
- token = LookAhead(json, index);
+ token = LookAhead(ref reader, ref json, ref index);
if (token == JSON.TOKEN_NONE) {
success = false;
return null;
} else if (token == JSON.TOKEN_COMMA) {
- NextToken(json, ref index);
+ NextToken(ref reader, ref json, ref index);
} else if (token == JSON.TOKEN_CURLY_CLOSE) {
- NextToken(json, ref index);
+ NextToken(ref reader, ref json, ref index);
return table;
} else {

@@ -102,14 +147,14 @@
}

// :
- token = NextToken(json, ref index);
+ token = NextToken(ref reader, ref json, ref index);
if (token != JSON.TOKEN_COLON) {
success = false;
return null;
}

// value
- object value = ParseValue(json, ref index, ref success);
+ object value = ParseValue(ref reader, ref json, ref index, ref success);
if (!success) {
success = false;
return null;
@@ -122,26 +167,32 @@
return table;
}

- protected static ArrayList ParseArray(char[] json, ref int index, ref bool success)
+ protected static ArrayList ParseArray(char[] json, ref int index, ref bool success)
+ {
+ StreamReader reader = null;
+ return ParseArray(ref reader, ref json, ref index, ref success);
+ }
+
+ protected static ArrayList ParseArray(ref StreamReader reader, ref char[] json, ref int index, ref bool success)
{
ArrayList array = new ArrayList();

// [
- NextToken(json, ref index);
+ NextToken(ref reader, ref json, ref index);

bool done = false;
while (!done) {
- int token = LookAhead(json, index);
+ int token = LookAhead(ref reader, ref json, ref index);
if (token == JSON.TOKEN_NONE) {
success = false;
return null;
} else if (token == JSON.TOKEN_COMMA) {
- NextToken(json, ref index);
+ NextToken(ref reader, ref json, ref index);
} else if (token == JSON.TOKEN_SQUARED_CLOSE) {
- NextToken(json, ref index);
+ NextToken(ref reader, ref json, ref index);
break;
} else {
- object value = ParseValue(json, ref index, ref success);
+ object value = ParseValue(ref reader, ref json, ref index, ref success);
if (!success) {
return null;
}
@@ -153,154 +204,191 @@
return array;
}

- protected static object ParseValue(char[] json, ref int index, ref bool success)
- {
- switch (LookAhead(json, index)) {
- case JSON.TOKEN_STRING:
- return ParseString(json, ref index, ref success);
- case JSON.TOKEN_NUMBER:
- return ParseNumber(json, ref index);
- case JSON.TOKEN_CURLY_OPEN:
- return ParseObject(json, ref index, ref success);
- case JSON.TOKEN_SQUARED_OPEN:
- return ParseArray(json, ref index, ref success);
- case JSON.TOKEN_TRUE:
- NextToken(json, ref index);
- return Boolean.Parse("TRUE");
- case JSON.TOKEN_FALSE:
- NextToken(json, ref index);
- return Boolean.Parse("FALSE");
- case JSON.TOKEN_NULL:
- NextToken(json, ref index);
- return null;
- case JSON.TOKEN_NONE:
- break;
- }
-
- success = false;
- return null;
- }
-
- protected static string ParseString(char[] json, ref int index, ref bool success)
- {
- StringBuilder s = new StringBuilder(BUILDER_CAPACITY);
- char c;
-
- EatWhitespace(json, ref index);
-
- // "
- c = json[index++];
-
- bool complete = false;
- while (!complete) {
-
- if (index == json.Length) {
- break;
- }
-
- c = json[index++];
- if (c == '"') {
- complete = true;
- break;
- } else if (c == '\\') {
-
- if (index == json.Length) {
- break;
- }
- c = json[index++];
- if (c == '"') {
- s.Append('"');
- } else if (c == '\\') {
- s.Append('\\');
- } else if (c == '/') {
- s.Append('/');
- } else if (c == 'b') {
- s.Append('\b');
- } else if (c == 'f') {
- s.Append('\f');
- } else if (c == 'n') {
- s.Append('\n');
- } else if (c == 'r') {
- s.Append('\r');
- } else if (c == 't') {
- s.Append('\t');
- } else if (c == 'u') {
- int remainingLength = json.Length - index;
- if (remainingLength >= 4) {
- // fetch the next 4 chars
- char[] unicodeCharArray = new char[4];
- Array.Copy(json, index, unicodeCharArray, 0, 4);
- // parse the 32 bit hex into an integer codepoint
- uint codePoint = UInt32.Parse(new string(unicodeCharArray), NumberStyles.HexNumber);
- // convert the integer codepoint to a unicode char and add to string
- s.Append(Char.ConvertFromUtf32((int)codePoint));
- // skip 4 chars
- index += 4;
- } else {
- break;
- }
- }
-
- } else {
- s.Append(c);
- }
-
- }
-
- if (!complete) {
- success = false;
- return null;
- }
-
- return s.ToString();
- }
-
- protected static double ParseNumber(char[] json, ref int index)
- {
- EatWhitespace(json, ref index);
-
- int lastIndex = GetLastIndexOfNumber(json, index);
- int charLength = (lastIndex - index) + 1;
- char[] numberCharArray = new char[charLength];
-
- Array.Copy(json, index, numberCharArray, 0, charLength);
- index = lastIndex + 1;
- return Double.Parse(new string(numberCharArray), CultureInfo.InvariantCulture);
- }
-
- protected static int GetLastIndexOfNumber(char[] json, int index)
- {
- int lastIndex;
-
- for (lastIndex = index; lastIndex < json.Length; lastIndex++) {
- if ("0123456789+-.eE".IndexOf(json[lastIndex]) == -1) {
- break;
- }
- }
- return lastIndex - 1;
- }
-
- protected static void EatWhitespace(char[] json, ref int index)
- {
- for (; index < json.Length; index++) {
- if (" \t\n\r".IndexOf(json[index]) == -1) {
- break;
- }
- }
- }
-
- protected static int LookAhead(char[] json, int index)
- {
- int saveIndex = index;
- return NextToken(json, ref saveIndex);
- }
-
- protected static int NextToken(char[] json, ref int index)
- {
+ protected static object ParseValue(char[] json, ref int index, ref bool success)
+ {
+ StreamReader reader = null;
+ return ParseValue(ref reader, ref json, ref index, ref success);
+ }
+
+ protected static object ParseValue(ref StreamReader reader, ref char[] json, ref int index, ref bool success)
+ {
+ switch (LookAhead(ref reader, ref json, ref index)) {
+ case JSON.TOKEN_STRING:
+ return ParseString(json, ref index, ref success);
+ case JSON.TOKEN_NUMBER:
+ return ParseNumber(json, ref index);
+ case JSON.TOKEN_CURLY_OPEN:
+ return ParseObject(ref reader, ref json, ref index, ref success);
+ case JSON.TOKEN_SQUARED_OPEN:
+ return ParseArray(ref reader, ref json, ref index, ref success);
+ case JSON.TOKEN_TRUE:
+ NextToken(json, ref index);
+ return Boolean.Parse("TRUE");
+ case JSON.TOKEN_FALSE:
+ NextToken(json, ref index);
+ return Boolean.Parse("FALSE");
+ case JSON.TOKEN_NULL:
+ NextToken(json, ref index);
+ return null;
+ case JSON.TOKEN_NONE:
+ break;
+ }
+
+ success = false;
+ return null;
+ }
+
+ protected static string ParseString(char[] json, ref int index, ref bool success)
+ {
+ StringBuilder s = new StringBuilder(BUILDER_CAPACITY);
+ char c;
+
+ EatWhitespace(json, ref index);
+
+ // "
+ c = json[index++];
+
+ bool complete = false;
+ while (!complete) {
+
+ if (index == json.Length) {
+ break;
+ }
+
+ c = json[index++];
+ if (c == '"') {
+ complete = true;
+ break;
+ } else if (c == '\\') {
+
+ if (index == json.Length) {
+ break;
+ }
+ c = json[index++];
+ if (c == '"') {
+ s.Append('"');
+ } else if (c == '\\') {
+ s.Append('\\');
+ } else if (c == '/') {
+ s.Append('/');
+ } else if (c == 'b') {
+ s.Append('\b');
+ } else if (c == 'f') {
+ s.Append('\f');
+ } else if (c == 'n') {
+ s.Append('\n');
+ } else if (c == 'r') {
+ s.Append('\r');
+ } else if (c == 't') {
+ s.Append('\t');
+ } else if (c == 'u') {
+ int remainingLength = json.Length - index;
+ if (remainingLength >= 4) {
+ // fetch the next 4 chars
+ char[] unicodeCharArray = new char[4];
+ Array.Copy(json, index, unicodeCharArray, 0, 4);
+ // parse the 32 bit hex into an integer codepoint
+ uint codePoint = UInt32.Parse(new string(unicodeCharArray), NumberStyles.HexNumber);
+ // convert the integer codepoint to a unicode char and add to string
+ s.Append(Char.ConvertFromUtf32((int)codePoint));
+ // skip 4 chars
+ index += 4;
+ } else {
+ break;
+ }
+ }
+
+ } else {
+ s.Append(c);
+ }
+
+ }
+
+ if (!complete) {
+ success = false;
+ return null;
+ }
+
+ return s.ToString();
+ }
+
+ protected static double ParseNumber(char[] json, ref int index)
+ {
+ EatWhitespace(json, ref index);
+
+ int lastIndex = GetLastIndexOfNumber(json, index);
+ int charLength = (lastIndex - index) + 1;
+ char[] numberCharArray = new char[charLength];
+
+ Array.Copy(json, index, numberCharArray, 0, charLength);
+ index = lastIndex + 1;
+ return Double.Parse(new string(numberCharArray), CultureInfo.InvariantCulture);
+ }
+
+ protected static int GetLastIndexOfNumber(char[] json, int index)
+ {
+ int lastIndex;
+
+ for (lastIndex = index; lastIndex < json.Length; lastIndex++) {
+ if ("0123456789+-.eE".IndexOf(json[lastIndex]) == -1) {
+ break;
+ }
+ }
+ return lastIndex - 1;
+ }
+
+ protected static void EatWhitespace(char[] json, ref int index)
+ {
+ for (; index < json.Length; index++) {
+ if (" \t\n\r".IndexOf(json[index]) == -1) {
+ break;
+ }
+ }
+ }
+
+ protected static int LookAhead(char[] json, int index)
+ {
+ StreamReader reader = null;
+ return LookAhead(ref reader, ref json, ref index);
+ }
+
+ protected static int LookAhead(ref StreamReader reader, ref char[] json, ref int index)
+ {
+ int saveIndex = index;
+ bool resetIndex = false;
+ var nextToken = NextToken(ref reader, ref json, ref saveIndex, ref resetIndex);
+ if (resetIndex) index = 0;
+ return nextToken;
+ }
+
+ protected static int NextToken(char[] json, ref int index)
+ {
+ StreamReader reader = null;
+ return NextToken(ref reader, ref json, ref index);
+ }
+
+ protected static int NextToken(ref StreamReader reader, ref char[] json, ref int index)
+ {
+ bool resetIndex = false;
+ return NextToken(ref reader, ref json, ref index, ref resetIndex);
+ }
+
+ protected static int NextToken(ref StreamReader reader, ref char[] json, ref int index, ref bool resetIndex)
+ {
+ // only relevant for streams; indicates if a new line was read.
+ resetIndex = false;
EatWhitespace(json, ref index);

if (index == json.Length) {
- return JSON.TOKEN_NONE;
+ if (reader == null || reader.Peek() == -1) {
+ return JSON.TOKEN_NONE;
+ }
+ else {
+ json = reader.ReadLine().ToCharArray();
+ resetIndex = true;
+ index = 0;
+ EatWhitespace(json, ref index);
+ }
}

char c = json[index];
@@ -374,132 +462,132 @@
return JSON.TOKEN_NONE;
}

- protected static bool SerializeValue(object value, StringBuilder builder)
- {
- bool success = true;
-
- if (value is string) {
- success = SerializeString((string)value, builder);
- } else if (value is Hashtable) {
- success = SerializeObject((Hashtable)value, builder);
- } else if (value is ArrayList) {
- success = SerializeArray((ArrayList)value, builder);
- } else if (IsNumeric(value)) {
- success = SerializeNumber(Convert.ToDouble(value), builder);
- } else if ((value is Boolean) && ((Boolean)value == true)) {
- builder.Append("true");
- } else if ((value is Boolean) && ((Boolean)value == false)) {
- builder.Append("false");
- } else if (value == null) {
- builder.Append("null");
- } else {
- success = false;
- }
- return success;
- }
-
- protected static bool SerializeObject(Hashtable anObject, StringBuilder builder)
- {
- builder.Append("{");
-
- IDictionaryEnumerator e = anObject.GetEnumerator();
- bool first = true;
- while (e.MoveNext()) {
- string key = e.Key.ToString();
- object value = e.Value;
-
- if (!first) {
- builder.Append(", ");
- }
-
- SerializeString(key, builder);
- builder.Append(":");
- if (!SerializeValue(value, builder)) {
- return false;
- }
-
- first = false;
- }
-
- builder.Append("}");
- return true;
- }
-
- protected static bool SerializeArray(ArrayList anArray, StringBuilder builder)
- {
- builder.Append("[");
-
- bool first = true;
- for (int i = 0; i < anArray.Count; i++) {
- object value = anArray[i];
-
- if (!first) {
- builder.Append(", ");
- }
-
- if (!SerializeValue(value, builder)) {
- return false;
- }
-
- first = false;
- }
-
- builder.Append("]");
- return true;
- }
-
- protected static bool SerializeString(string aString, StringBuilder builder)
- {
- builder.Append("\"");
-
- char[] charArray = aString.ToCharArray();
- for (int i = 0; i < charArray.Length; i++) {
- char c = charArray[i];
- if (c == '"') {
- builder.Append("\\\"");
- } else if (c == '\\') {
- builder.Append("\\\\");
- } else if (c == '\b') {
- builder.Append("\\b");
- } else if (c == '\f') {
- builder.Append("\\f");
- } else if (c == '\n') {
- builder.Append("\\n");
- } else if (c == '\r') {
- builder.Append("\\r");
- } else if (c == '\t') {
- builder.Append("\\t");
- } else {
- int codepoint = Convert.ToInt32(c);
- if ((codepoint >= 32) && (codepoint <= 126)) {
- builder.Append(c);
- } else {
- builder.Append("\\u" + Convert.ToString(codepoint, 16).PadLeft(4, '0'));
- }
- }
- }
-
- builder.Append("\"");
- return true;
- }
-
- protected static bool SerializeNumber(double number, StringBuilder builder)
- {
- builder.Append(Convert.ToString(number, CultureInfo.InvariantCulture));
- return true;
- }
-
- /// <summary>
- /// Determines if a given object is numeric in any way
- /// (can be integer, double, null, etc).
- ///
- /// Thanks to mtighe for pointing out Double.TryParse to me.
- /// </summary>
- protected static bool IsNumeric(object o)
- {
- double result;
-
- return (o == null) ? false : Double.TryParse(o.ToString(), out result);
- }
- }
+ protected static bool SerializeValue(object value, StringBuilder builder)
+ {
+ bool success = true;
+
+ if (value is string) {
+ success = SerializeString((string)value, builder);
+ } else if (value is Hashtable) {
+ success = SerializeObject((Hashtable)value, builder);
+ } else if (value is ArrayList) {
+ success = SerializeArray((ArrayList)value, builder);
+ } else if (IsNumeric(value)) {
+ success = SerializeNumber(Convert.ToDouble(value), builder);
+ } else if ((value is Boolean) && ((Boolean)value == true)) {
+ builder.Append("true");
+ } else if ((value is Boolean) && ((Boolean)value == false)) {
+ builder.Append("false");
+ } else if (value == null) {
+ builder.Append("null");
+ } else {
+ success = false;
+ }
+ return success;
+ }
+
+ protected static bool SerializeObject(Hashtable anObject, StringBuilder builder)
+ {
+ builder.Append("{");
+
+ IDictionaryEnumerator e = anObject.GetEnumerator();
+ bool first = true;
+ while (e.MoveNext()) {
+ string key = e.Key.ToString();
+ object value = e.Value;
+
+ if (!first) {
+ builder.Append(", ");
+ }
+
+ SerializeString(key, builder);
+ builder.Append(":");
+ if (!SerializeValue(value, builder)) {
+ return false;
+ }
+
+ first = false;
+ }
+
+ builder.Append("}");
+ return true;
+ }
+
+ protected static bool SerializeArray(ArrayList anArray, StringBuilder builder)
+ {
+ builder.Append("[");
+
+ bool first = true;
+ for (int i = 0; i < anArray.Count; i++) {
+ object value = anArray[i];
+
+ if (!first) {
+ builder.Append(", ");
+ }
+
+ if (!SerializeValue(value, builder)) {
+ return false;
+ }
+
+ first = false;
+ }
+
+ builder.Append("]");
+ return true;
+ }
+
+ protected static bool SerializeString(string aString, StringBuilder builder)
+ {
+ builder.Append("\"");
+
+ char[] charArray = aString.ToCharArray();
+ for (int i = 0; i < charArray.Length; i++) {
+ char c = charArray[i];
+ if (c == '"') {
+ builder.Append("\\\"");
+ } else if (c == '\\') {
+ builder.Append("\\\\");
+ } else if (c == '\b') {
+ builder.Append("\\b");
+ } else if (c == '\f') {
+ builder.Append("\\f");
+ } else if (c == '\n') {
+ builder.Append("\\n");
+ } else if (c == '\r') {
+ builder.Append("\\r");
+ } else if (c == '\t') {
+ builder.Append("\\t");
+ } else {
+ int codepoint = Convert.ToInt32(c);
+ if ((codepoint >= 32) && (codepoint <= 126)) {
+ builder.Append(c);
+ } else {
+ builder.Append("\\u" + Convert.ToString(codepoint, 16).PadLeft(4, '0'));
+ }
+ }
+ }
+
+ builder.Append("\"");
+ return true;
+ }
+
+ protected static bool SerializeNumber(double number, StringBuilder builder)
+ {
+ builder.Append(Convert.ToString(number, CultureInfo.InvariantCulture));
+ return true;
+ }
+
+ /// <summary>
+ /// Determines if a given object is numeric in any way
+ /// (can be integer, double, etc.). A null value is not numeric.
+ ///
+ /// Thanks to mtighe for pointing out Double.TryParse to me.
+ /// </summary>
+ protected static bool IsNumeric(object o)
+ {
+ double result;
+
+ return (o == null) ? false : Double.TryParse(o.ToString(), out result);
+ }
+ }
}
Archer
Placed on: 12-17-2010 10:15
Thanks very much!!! It helps a lot!!!
philipp
Placed on: 12-19-2010 21:15
I have used your library for some time now, and I found three possible improvements.

In the code there are two cases with something like:
char[] array = new char[len];
Array.Copy(json, index, array, 0, len);
string s = new string(array);
Instead of copying the chars into a new array, the following could be used:
string s = new string(json, index, len);
This is shorter and may even improve performance in some cases.

When JsonDecode() is called with either of the following invalid JSON inputs, an exception is thrown:
1e
or
"\ugggg"
The code could use TryParse() instead of Parse() to prevent this.
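
A minimal sketch of the TryParse suggestion, assuming the number text and the four hex digits have already been cut out of the input. The class SafeParsing and both method names are hypothetical and are not part of the library. The sketch only shows the non-throwing pattern: NumberStyles.Float accepts slightly more than the JSON number grammar does.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, for illustration only.
public static class SafeParsing
{
    // Returns false instead of throwing on malformed numbers such as "1e".
    public static bool TryParseNumber(string text, out double number)
    {
        return Double.TryParse(text, NumberStyles.Float,
                               CultureInfo.InvariantCulture, out number);
    }

    // Returns false instead of throwing on malformed escapes such as "gggg".
    public static bool TryParseHexCodepoint(string hexDigits, out uint codepoint)
    {
        return UInt32.TryParse(hexDigits, NumberStyles.HexNumber,
                               CultureInfo.InvariantCulture, out codepoint);
    }

    public static void Main()
    {
        double number;
        uint codepoint;

        Console.WriteLine(TryParseNumber("1e", out number));            // False
        Console.WriteLine(TryParseNumber("1e3", out number));           // True
        Console.WriteLine(TryParseHexCodepoint("gggg", out codepoint)); // False
        Console.WriteLine(TryParseHexCodepoint("00e9", out codepoint)); // True
    }
}
```

A parser using this pattern can set its success flag to false when either method fails, rather than letting the exception escape.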

For me the following line is not really needed to compile:
using System.Data;
But in some environments (e.g. Mono) it needs to be removed when not "linking" to it.

Thank you for your time.
John Petersen
Placed on: 12-23-2010 16:01
A very cool piece of code. I like the simplicity of it. One thing I've noticed I hope you can shed some light on. It seems that in some cases, the order of elements gets re-arranged. Example, parse this json:

{ "id" : "0001", "type" : "donut", "name" : "Cake", "ppu" : 0.55, "batters" : { "batter" : [{ "id" : "1001", "type" : "Regular" }, { "id" : "1002", "type" : "Chocolate" }, { "id" : "1003", "type" : "Blueberry" }, { "id" : "1004", "type" : "Devil's Food" }] }, "topping" : [{ "id" : "5001", "type" : "None" }, { "id" : "5002", "type" : "Glazed" }, { "id" : "5005", "type" : "Sugar" }, { "id" : "5007", "type" : "Powdered Sugar" }, { "id" : "5006", "type" : "Chocolate with Sprinkles" }, { "id" : "5003", "type" : "Chocolate" }, { "id" : "5004", "type" : "Maple" }] }

You will see that the ppu element now appears as the second element off the root in the resulting hashtable. What was originally the second element (type: donut) is now the last element off the root. Structurally, I don't think this alters the document; after all, the relative position of key/value pairs does not matter. The hierarchy is respected, and the embedded ArrayLists and Hashtables are fine.

I've looked at the code and I suspect there is some side effect with the recursion.

Curious as to your thoughts as to what is going on. Thanks..
philipp
Placed on: 12-24-2010 12:01
@ John Petersen
This behavior is valid. JSON objects are unordered sets of name/value pairs by definition (see json.org), and the elements of a Hashtable are not kept in insertion order (see MSDN). Therefore elements may be re-arranged when converting from one representation into the other.
Array(List)s may be used if the data has to stay in order.
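
To illustrate the point: values in a Hashtable are best read by key, never by position. The sketch below also shows System.Collections.Specialized.OrderedDictionary, which does keep insertion order. Using it is an assumption for illustration only; the parser discussed here fills a plain Hashtable. The class OrderDemo and the method JoinKeys are hypothetical names.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Collections.Specialized;

// Hypothetical demo class, for illustration only.
public static class OrderDemo
{
    // Joins the keys of a dictionary in its enumeration order.
    public static string JoinKeys(IDictionary dictionary)
    {
        List<string> keys = new List<string>();
        foreach (DictionaryEntry entry in dictionary) {
            keys.Add(entry.Key.ToString());
        }
        return String.Join(",", keys.ToArray());
    }

    public static void Main()
    {
        Hashtable unordered = new Hashtable();
        unordered["id"] = "0001";
        unordered["type"] = "donut";
        unordered["ppu"] = 0.55;

        // Lookup by key works regardless of enumeration order.
        Console.WriteLine(unordered["type"]);   // donut

        OrderedDictionary ordered = new OrderedDictionary();
        ordered.Add("id", "0001");
        ordered.Add("type", "donut");
        ordered.Add("ppu", 0.55);

        // OrderedDictionary enumerates in insertion order.
        Console.WriteLine(JoinKeys(ordered));   // id,type,ppu
    }
}
```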
pwg17
Placed on: 01-30-2011 10:35
This is so cool! Thanks!
garfix
Placed on: 02-07-2011 11:14
Patrick van Bergen
to be continuum
@philipp

Sorry it took so long for me to reply to you. Thanks for your improvements! I added all of them (2 speed improvements, 2 uncaught exceptions in case of invalid input, 1 unnecessary 'using' statement).

I like the TryParse function. That really helps.
klinkenbecker
Placed on: 02-19-2011 22:22
The C# JSON parser presented here is incomplete.

Specifically, an array is a valid value within an array and thus the ParseArray function needs to recursively parse array values.

Here is a possible (simplified) fix:

protected static ArrayList ParseArray(char[] json, ref int index, ref bool success)
{
ArrayList array = new ArrayList();

// puts any parse error null into the array
// this improves error location resolution

// [
NextToken(json, ref index);

do
switch (LookAhead(json, index))
{
case JSON.TOKEN_NONE:
success = false;
return array;
case JSON.TOKEN_COMMA:
NextToken(json, ref index);
continue;

case JSON.TOKEN_OPEN_SQUARED:
array.Add(ParseArray(json, ref index, ref success));
continue;

case JSON.TOKEN_CLOSE_SQUARED:
NextToken(json, ref index);
return array;

default:
array.Add(ParseValue(json, ref index, ref success));
continue;
}
while (success);

return array;
}
Question
Placed on: 02-22-2011 17:32
I'm trying to use the JSONDecode method to output to an arraylist but I'm not sure how to interpret the error it's encountering:

string myStr = "";

ArrayList al = new ArrayList();
al = (ArrayList)JSON.JsonDecode(json);

DataTable arraydt = new DataTable();

for (int i = 0; i < al.Count; i++)
{
myStr = al[i].ToString() + "\n";
}

return myStr;

ERROR: "Unable to cast object of type 'System.Collections.Hashtable' to type 'System.Collections.ArrayList'."

Does the JSONDecode function cast to a Hashtable by default? Help on this would be greatly appreciated by my novice c sharp self.
garfix
Placed on: 02-22-2011 19:44
Patrick van Bergen
Hi Question,

JSONDecode returns a Hashtable if the json string represents an object; and it returns an ArrayList if it represents a list. Can you show me the input string, or part of it if it's too big?
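
A minimal sketch of how calling code might handle both cases without an invalid cast. The class DecodeResultDemo and the method Describe are hypothetical. The argument stands in for whatever the decoder returned, and the values in Main are built by hand so that the example is self-contained.

```csharp
using System;
using System.Collections;

// Hypothetical demo class, for illustration only.
public static class DecodeResultDemo
{
    // Describes a decoded value without assuming its concrete type.
    public static string Describe(object decoded)
    {
        Hashtable obj = decoded as Hashtable;
        if (obj != null) {
            return "object with " + obj.Count + " member(s)";
        }

        ArrayList list = decoded as ArrayList;
        if (list != null) {
            return "array with " + list.Count + " element(s)";
        }

        return (decoded == null) ? "null" : "scalar value";
    }

    public static void Main()
    {
        // Hand-built stand-in for the result of decoding {"id": "0001"}
        Hashtable obj = new Hashtable();
        obj["id"] = "0001";

        // Hand-built stand-in for the result of decoding [1, 2]
        ArrayList list = new ArrayList();
        list.Add(1.0);
        list.Add(2.0);

        Console.WriteLine(Describe(obj));   // object with 1 member(s)
        Console.WriteLine(Describe(list));  // array with 2 element(s)
    }
}
```

The as operator yields null rather than throwing when the type does not match, which avoids the cast error quoted above.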
ray
Placed on: 03-04-2011 00:05
Thanks a lot for this! Made one minor adjustment though.

Instead of only accepting ArrayList for encoding, I replaced ArrayList with the IList interface. This way arrays and (generic) lists are encoded too.
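
A rough sketch of what such a change could look like. This is not ray's actual code. The class ListSerializerSketch is hypothetical, and its SerializeValue is a reduced stand-in that handles only plain strings (without escaping), ints, doubles and nested lists.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical sketch class, for illustration only.
public static class ListSerializerSketch
{
    // Accepts any IList: ArrayList, arrays and generic List<T> alike.
    public static bool SerializeList(IList list, StringBuilder builder)
    {
        builder.Append("[");
        for (int i = 0; i < list.Count; i++) {
            if (i > 0) {
                builder.Append(", ");
            }
            if (!SerializeValue(list[i], builder)) {
                return false;
            }
        }
        builder.Append("]");
        return true;
    }

    // Reduced stand-in for the full SerializeValue; no string escaping.
    private static bool SerializeValue(object value, StringBuilder builder)
    {
        if (value is string) {
            builder.Append('"').Append((string)value).Append('"');
        } else if (value is IList) {
            return SerializeList((IList)value, builder);
        } else if (value is int || value is double) {
            builder.Append(Convert.ToString(value, CultureInfo.InvariantCulture));
        } else {
            return false;
        }
        return true;
    }

    public static void Main()
    {
        StringBuilder fromArray = new StringBuilder();
        SerializeList(new int[] { 1, 2, 3 }, fromArray);
        Console.WriteLine(fromArray);   // [1, 2, 3]

        StringBuilder fromList = new StringBuilder();
        SerializeList(new List<string> { "a", "b" }, fromList);
        Console.WriteLine(fromList);    // ["a", "b"]
    }
}
```

The string check comes before the IList check only for readability: a string does not implement IList, so the order does not change the result here.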
garfix
Placed on: 03-08-2011 13:01
Patrick van Bergen
@klinkenbecker You are right about the recursive nature of JSON strings. Strange that you found that the code doesn't handle it right. It looks to me that it does.

Can you give me an example of a nested array that isn't parsed right?
Mark
Placed on: 03-29-2011 19:54
It converted the object to JSON code well enough, but when I tried to convert the string back to the object, it returned null. I tried following it through the code, and it decided to return null at the first colon. Not only that, but the first character of the first key name got sliced off.
Alex
Placed on: 04-11-2011 22:09
I've used your JSON-lib in my open-source project (keeping the copyright). I hope that's fine: http://facebooklogin.codeplex.com/ Thanks!!!
garfix
Placed on: 04-12-2011 09:20
Patrick van Bergen
Best of luck to your project, Alex!