Strings in .NET and C#

Interning, Literals and the Debugger

Interning

.NET has the concept of an "intern pool". It's basically just a set of strings, but it makes sure that every time you reference the same string literal, you get a reference to the same string. This is probably language-dependent, but it's certainly true in C# and VB.NET, and I'd be very surprised to see a language for which it didn't hold, as IL makes it very easy to do (probably easier than failing to intern literals). As well as literals being automatically interned, you can intern strings manually with the Intern method, and check whether there is already an interned string with the same character sequence in the pool using the IsInterned method. Somewhat unintuitively, IsInterned returns a string rather than a boolean: if an equal string is in the pool, a reference to that string is returned; otherwise, null is returned. Likewise, the Intern method returns a reference to an interned string: either the string you passed in if it was already in the pool, or a newly created interned string, or an equal string which was already in the pool.
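The behaviour described above can be sketched as follows. This is a minimal illustration rather than a definitive test: the class and variable names are made up for the example, and the commented results assume a typical runtime where literals in the same assembly are interned.

```csharp
using System;

class InternDemo
{
    static void Main()
    {
        // Two occurrences of the same literal refer to the same string object.
        string literal1 = "hello";
        string literal2 = "hello";
        Console.WriteLine(object.ReferenceEquals(literal1, literal2)); // True

        // A string built at execution time is a separate object,
        // even though it contains the same characters.
        string built = new string(new char[] { 'h', 'e', 'l', 'l', 'o' });
        Console.WriteLine(built == literal1);                          // True
        Console.WriteLine(object.ReferenceEquals(built, literal1));    // False

        // Intern returns the reference that is already in the pool.
        string interned = string.Intern(built);
        Console.WriteLine(object.ReferenceEquals(interned, literal1)); // True

        // IsInterned returns the pooled string, or null if there is none.
        // A freshly generated GUID is very unlikely to be in the pool already.
        string unlikely = Guid.NewGuid().ToString();
        Console.WriteLine(string.IsInterned(unlikely) == null);
        Console.WriteLine(string.IsInterned(built) != null);           // True
    }
}
```

Note that == compares the character sequences, while object.ReferenceEquals checks whether both variables refer to the same object, which is what interning affects.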

Literals

Literals are how you hard-code strings into C# programs. There are two types of string literals in C# - regular string literals and verbatim string literals. Regular string literals are similar to those in many other languages such as Java and C - they start and end with ", and various characters (in particular, " itself, \, and carriage return (CR) and line feed (LF)) need to be "escaped" to be represented in the string. Verbatim string literals allow pretty much anything within them, and end at the first " which isn't doubled. Even carriage returns and line feeds can appear in the literal! To obtain a " within the string itself, you need to write "". Verbatim string literals are distinguished by having an @ before the opening quote. Here are some examples of the two types of literal, and what they amount to:

Regular literal          Verbatim literal       Resulting string
"Hello"                  @"Hello"               Hello
"Backslash: \\"          @"Backslash: \"        Backslash: \
"Quote: \""              @"Quote: """           Quote: "
"CRLF:\r\nPost CRLF"     @"CRLF:                CRLF:
                         Post CRLF"             Post CRLF

For other escape sequences, please see the relevant FAQ entry. Note that the difference is only for the compiler's sake. Once the string is in the compiled code, there's no such thing as a verbatim string literal vs a regular string literal.
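As a small sketch of that last point, the two literal forms below produce the same string once compiled. The class and variable names are only for illustration. A multi-line verbatim literal is deliberately left out, because its line break depends on the line endings used in the source file.

```csharp
using System;

class LiteralDemo
{
    static void Main()
    {
        // Regular literal: backslash and quote must be escaped.
        string regular = "Backslash: \\ Quote: \"";

        // Verbatim literal: backslash is taken as-is, a quote is doubled.
        string verbatim = @"Backslash: \ Quote: """;

        // Both forms compile to the same sequence of characters.
        Console.WriteLine(regular == verbatim);  // True
        Console.WriteLine(verbatim);             // Backslash: \ Quote: "
        Console.WriteLine(verbatim.Length);      // 21
    }
}
```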

Strings and the debugger

Numerous people run into problems when inspecting strings in the debugger, both with VS.NET 2002 and VS.NET 2003. Ironically, the problems are often generated by the debugger trying to be helpful, and either displaying the string as a regular string literal with backslash-escaped characters, or displaying it as a verbatim string literal complete with leading @. This leads to many questions asking how the @ can be removed, despite the fact that it's not really there in the first place - it's only how the debugger is showing it. Also, some versions of VS.NET will stop displaying the contents of the string at the first null character, and will evaluate its Length property incorrectly: the debugger calculates the value itself instead of asking the managed code, and again treats the string as finishing at the first null character.

Given the confusion this has caused, I believe it's best to examine strings in a different way when debugging, at least if you think something odd is going on. I suggest using a method like the one below, which will print the contents of a string to the console in a safe way. Depending on what kind of application you're developing, you may want to write this information to a log file, to the debug or trace listeners, or pop it up in a message box.

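// Requires "using System;" at the top of the file.
// Names of the ASCII control characters U+0000 to U+001F, indexed by code point.
static readonly string[] LowNames =
{
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US"
};

// Prints the length of the string, then one line per character with its code point.
public static void DisplayString (string text)
{
    if (text == null)
    {
        Console.WriteLine ("(null reference)");
        return;
    }
    Console.WriteLine ("String length: {0}", text.Length);
    foreach (char c in text)
    {
        if (c < 32)
        {
            // Control character: show its name rather than the raw character
            Console.WriteLine ("<{0}> U+{1:x4}", LowNames[c], (int)c);
        }
        else if (c >= 127)
        {
            // DEL (U+007F) and anything outside ASCII may not display reliably
            Console.WriteLine ("(Possibly non-printable) U+{0:x4}", (int)c);
        }
        else
        {
            // Printable ASCII: show the character itself
            Console.WriteLine ("{0} U+{1:x4}", c, (int)c);
        }
    }
}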
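If the console isn't the right destination, one possible adaptation is to pass the destination in as a parameter. The sketch below is a hypothetical variant, not the method above: the StringDump class and Dump method are names invented for this example, and it skips the control character names to stay short. It relies only on System.IO.TextWriter, which Console.Out, StringWriter and StreamWriter all derive from.

```csharp
using System;
using System.IO;

static class StringDump
{
    // Hypothetical variant of DisplayString: the caller chooses where the output goes.
    public static void Dump(string text, TextWriter output)
    {
        if (text == null)
        {
            output.WriteLine("(null reference)");
            return;
        }
        output.WriteLine("String length: {0}", text.Length);
        foreach (char c in text)
        {
            if (c < 32 || c >= 127)
            {
                // Control characters and non-ASCII: show the code point only
                output.WriteLine("(Possibly non-printable) U+{0:x4}", (int)c);
            }
            else
            {
                output.WriteLine("{0} U+{1:x4}", c, (int)c);
            }
        }
    }

    static void Main()
    {
        // A string with an embedded null character: Length counts all 7 characters.
        Dump("abc\0def", Console.Out);

        // Collect the output in memory, for example to hand it to a logger.
        using (StringWriter buffer = new StringWriter())
        {
            Dump("a\tb", buffer);
            Console.Write(buffer.ToString());
        }
    }
}
```

The first call also shows why this kind of dump is useful when the debugger stops at a null character: every character after the null still appears in the output.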

Comments

  3. 05 Jan 2006 at 21:12

    Quote:
    Posted by eliassal on 10 Nov 2005 06:17 AM
    Hi, I read 2 of your articles; they are interesting. However, I cannot figure out what or where the contents come from for the variable
    readonly string[] LowNames and how it is used. I would appreciate a short description.

    Do you mean you don't know how the array is populated, or you don't know why I populated it with the names I did?

    To answer the first question - if you look at the code, you'll see there's a static field initializer:

    static readonly string[] LowNames =
    {
       "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
       "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
       "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
       "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US"
    };

    The names come from the man page for ASCII on a unix box.

    Jon

  4. 05 Jan 2006 at 21:11

    Quote:
    Posted by av_rocksu on 11 Sep 2005 06:00 AM
    I'm working on a project which converts different forms of temperature like Celsius, Fahrenheit and Kelvin etc.
    I also have to put up access keys to it for each conversion. Please help me with the conversion function and access keys. I also have to put up access keys for reset as well as exit buttons on the form. What is a double type variable?

    I'm not at all sure what this has to do with Strings, but the C# type for double precision binary floating point values is "double".

    See my article on floating point arithmetic for more information. It's at http://www.pobox.com/~skeet/csharp/floatingpoint.html
    (Sorry, the "insert link" button doesn't seem to work in Firefox.)

    Jon

Jon Skeet C# MVP currently living in Reading and working for Google.