Friday, April 12, 2019

C# - Miscellaneous

Multiple files

A C# program can be spread over multiple files. So far all our code has been written in one large file. Let us create two .cs files, a.cs and b.cs, as follows.

a.cs
class zzz
{
public static void Main()
{
yyy a = new yyy();
a.abc();
}
}

b.cs
public class yyy
{
public void abc()
{
System.Console.WriteLine("hi");
}
}

Earlier our code did not span multiple files. C# does not care whether the code is in one file or spread over multiple files. We only have to make a small change when we compile the program.

Running csc a.cs b.cs will create a.exe. By default the executable is named after the source file that contains Main.

Output
hi

These files are called source files and it is a good idea to give them a file extension of .cs. Suppose you rename b.cs to b.xxx, as we did, and rerun csc as

>csc a.cs b.xxx

This will create a.exe as usual. File extensions matter to the programmer, not to the compiler.

ASCII and Unicode

a.cs
class zzz
{
public static void Main()
{
System.Console.WriteLine((char)65);
System.Console.WriteLine((char)66);
System.Console.WriteLine((char)67);
}
}




Output
A
B
C

Computers, in a way, are pretty dumb. They do not understand letters of the alphabet. All that they can store in memory are numbers. How then does a computer understand or display letters? Normally the WriteLine function displays 65 as 65, but here the output is A. In the () brackets we have placed a data type called char. We call the ( ) a cast. It means: for the moment, convert whatever follows into a char. Thus the number 65 gets converted into a char, which is displayed as 'A'. The 66 gets displayed as 'B'. Someone, somewhere in the world, invented a rule which specified that the number 65 represents a capital A, and so on. This rule is called ASCII. All that ASCII says is that the numbers from 0 to 127 represent small and capital letters, digits, punctuation and control codes; various extensions use the remaining numbers from 128 to 255. Whenever you write A, rest assured that somewhere in memory a 65 was stored. Each byte of a file on disk can likewise hold only a number from 0 to 255, and the same rule applies.
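
As a small illustration of the cast working in both directions, here is a minimal sketch. The variable names c and n are ours and chosen only for this example.

```csharp
class zzz
{
    public static void Main()
    {
        char c = (char)65;   // the number 65 viewed as a character
        int n = (int)'B';    // the character 'B' viewed as a number
        System.Console.WriteLine(c);             // A
        System.Console.WriteLine(n);             // 66
        System.Console.WriteLine((char)(c + 2)); // C, as 65 + 2 is 67
    }
}
```

Adding 2 to a char gives an int, which is why the last line needs the cast back to char.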

a.cs
class zzz
{
public static void Main()
{
char i = 'a';
System.Console.WriteLine((char)i);
}
}

Output
a

C# offers us the data type char to represent ASCII values naturally.

a.cs
class zzz
{
public static void Main()
{
int i;
for ( i=0; i<=255; i++)
System.Console.Write(i + " " + (char)i + " ");
}
}

The above program displays the entire ASCII table along with the 128 values that follow it. The problem with ASCII is that it is sufficient for a language like English, which does not have too many symbols to represent. However, languages like Japanese have far more symbols to represent than English. ASCII defines only 128 symbols, and even a full byte can hold at most 256 unique values. The industry thus invented Unicode, which originally used 2 bytes for every character, unlike ASCII's one. Unicode aims to represent all the languages of the world. C# understands Unicode, and thus the char data type stores characters internally as 2-byte Unicode (UTF-16) values and not ASCII.
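
To see that a char really holds values beyond the ASCII range, consider this minimal sketch. The two characters were picked by us purely as examples.

```csharp
class zzz
{
    public static void Main()
    {
        char e = '\u00e9';    // Latin small letter e with acute accent
        char hira = '\u3042'; // Hiragana letter A
        System.Console.WriteLine((int)e);       // 233
        System.Console.WriteLine((int)hira);    // 12354
        System.Console.WriteLine(sizeof(char)); // 2, the size of a char in bytes
    }
}
```

We print the numbers rather than the characters themselves, as whether the console can draw them depends on its font and encoding.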

When this text was first written, the current Unicode standard was 3.0. The standard has been revised many times since.

a.cs
class zzz
{
public static void Main()
{
int int;
}
}

Compiler Error
a.cs(5,5): error CS1041: Identifier expected, 'int' is a keyword
a.cs(5,8): error CS1001: Identifier expected

Words like int, char, if, etc. are reserved by C# and we are not allowed to use them as function names, class names, variable names and so on. However, if you insist on doing so, then you have to prefix the name with an @ sign, as below.


a.cs
class zzz
{
public static void Main()
{
int @int;
}
}

Compiler Warning
a.cs(5,5): warning CS0168: The variable 'int' is declared but never used

The warning does not name the variable @int but int.

a.cs
class zzz
{
public static void Main()
{
int @int;
@int = 10;
System.Console.WriteLine(@int);
}
}

Output
10

We have millions of names to choose from for a variable, so why insist on int? There will be times when code written in another language uses a name that C# reserves, and in those cases we would use the @ sign to refer to it. It is advisable not to use the @ sign very often.
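
The @ sign is not part of the name itself, as the warning above hinted. The following sketch, with names invented by us, shows that it may even be placed before a word that is not reserved.

```csharp
class zzz
{
    public static void Main()
    {
        int abc = 10;
        // @abc and abc are the same variable; the @ is only a prefix
        System.Console.WriteLine(@abc); // 10
        // class is a reserved word, so here the @ is compulsory
        string @class = "a keyword used as a name";
        System.Console.WriteLine(@class);
    }
}
```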

When we run the C# compiler on our program, it does two things. First, it reads our code and converts it into things, called tokens, that it understands. This is called lexical analysis. Then it does a syntactic analysis, which eventually gives us executable output.

Comments

Comments are a form of documentation. They are lines of code written for our benefit (the community of programmers) and not for C#'s. In spite of this, programmers in general are lazy in writing comments. Comments are ignored by the compiler.

a.cs
// hi this is comment
class zzz
{
public static void Main() // allowed here
{
/*
A comment over
two lines
*/
}
}

A regular comment starts with a /* and ends with a */. Such comments can be spread over multiple lines and can be placed almost anywhere in your code. Anything from a // to the end of the line is a one-line comment and, as the name suggests, it cannot span multiple lines. A single-line comment does not have to be at the beginning of a line.
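
A short sketch of some less obvious placements follows. The program is our own illustration.

```csharp
class zzz
{
    public static void Main()
    {
        int i = /* a comment in the middle of a statement */ 10;
        // System.Console.WriteLine("never runs"); this whole line is a comment
        System.Console.WriteLine(i); // prints 10
        // inside a string, the two slashes are ordinary characters
        System.Console.WriteLine("// not a comment inside a string");
    }
}
```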

Escape Sequences and Strings

a.cs
class zzz
{
public static void Main()
{
System.Console.WriteLine("hi \nBye\tNo");
System.Console.WriteLine("\\");
}
}

Output
hi
Bye      No
\

An escape sequence is anything that starts with a \. A \n means start printing from a new line and a \t means a tab. Two backslashes convert into a single backslash.
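
A few more escape sequences are worth knowing. This is a minimal sketch of our own, not an exhaustive list.

```csharp
class zzz
{
    public static void Main()
    {
        System.Console.WriteLine("She said \"hi\""); // She said "hi"
        System.Console.WriteLine('\'');               // a single quote as a char
        System.Console.WriteLine("\u0041\u0042");     // AB, written as Unicode values
    }
}
```

The \" lets a double quote appear inside a string, and \u followed by four hex digits names a character by its Unicode number.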

a.cs
class zzz
{
public static void Main()
{
System.Console.WriteLine(@"hi \nBye\tNo");
}
}

Output
hi \nBye\tNo

A string is anything in double quotes. A verbatim string starts with an @ sign; the C# compiler does not interpret the escape sequences in it, so they are displayed verbatim.

a.cs
class zzz
{
public static void Main()
{
System.Console.WriteLine("hi
  bye");
}
}

Compiler Error
a.cs(5,26): error CS1010: Newline in constant
a.cs(6,6): error CS1010: Newline in constant

A string cannot span multiple lines.

a.cs
class zzz
{
public static void Main()
{
System.Console.WriteLine(@"hi
  bye");
}
}

Output
hi
  bye

Placing an @ in front of the string lets it span multiple lines, and the spaces at the start of the second line are shown in the output. If you want the \ to lose its special meaning in a string, prefix that string with an @ sign.
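
File paths are a common place where verbatim strings help. In the sketch below the path is hypothetical and is only printed, never opened.

```csharp
class zzz
{
    public static void Main()
    {
        string a = "C:\\temp\\new.txt"; // every backslash has to be doubled
        string b = @"C:\temp\new.txt";  // the same text as a verbatim string
        System.Console.WriteLine(b);      // C:\temp\new.txt
        System.Console.WriteLine(a == b); // True
        // inside a verbatim string, a double quote is written twice
        System.Console.WriteLine(@"say ""hi"""); // say "hi"
    }
}
```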

a.cs
class zzz
{
public static void Main()
{
string a = "bye";
string b = "bye";
System.Console.WriteLine(a == b);
}
}

Output
True

The above example displays True, even though the two strings may be stored in different areas of memory. The == operator compares the characters of the two strings, and as both contain the same characters they are equal.
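
To make the point about different areas of memory visible, the sketch below builds the second string at run time so that it is a separate object. The use of object.ReferenceEquals here is our own addition for illustration.

```csharp
class zzz
{
    public static void Main()
    {
        string a = "bye";
        // built from characters at run time, so it is a separate object
        string b = new string(new char[] { 'b', 'y', 'e' });
        System.Console.WriteLine(a == b);                       // True, same characters
        System.Console.WriteLine(object.ReferenceEquals(a, b)); // False, different objects
    }
}
```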

The Preprocessor

Before the C# compiler proper can start, a small part of it called the preprocessor activates itself. It is called the preprocessor because the same concept existed in the programming language C. All that the preprocessor does is look at the lines beginning with a # symbol.

a.cs
#define vijay
class zzz
{
public static void Main()
{
}
}

The first preprocessor directive we are learning is called define. This lets us create a word/variable or even better, an identifier called vijay. The identifier vijay has no value other than true.

a.cs
class zzz
{
public static void Main()
{
#define vijay
}
}

Compiler Error
a.cs(5,2): error CS1032: Cannot define/undefine preprocessor symbols after first token in file

We cannot use #define after valid C# code has been written. These directives have to come at the beginning of the file.


a.cs
#define vijay
class zzz
{
public static void Main()
{
#if vijay
System.Console.WriteLine("1");
#endif
}
}

Output
1

As a #define creates a variable, its value can be checked by the if or, more precisely, the #if, which works in the same way as the if of C# does. Here the #if is true, so all the code up to the #endif gets added to the program.

a.cs
class zzz
{
public static void Main()
{
#if vijay
System.Console.WriteLine("1");
#else
System.Console.WriteLine("2");
#endif
}
}

Output
2

The same rules as before apply to the #else. Here, as we have not created an identifier called vijay, it gets a value of false and therefore the #if is false. Think of a preprocessor identifier as a boolean variable.

Why use a preprocessor variable instead of a normal one?

Run the C# compiler as follows on the above program and observe the change in output.

csc /D:vijay a.cs

Output
1

The output displays 1 as the /D compiler option lets you create identifiers at the time of compiling the program. This cannot be done with a normal variable. We can thus add or remove lots of code from our program at the time of compilation.
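
One typical use, sketched below, is to keep extra diagnostic output in the source and switch it on only when needed. The identifier trace is a name we made up for this example.

```csharp
class zzz
{
    public static void Main()
    {
#if trace
        // compiled in only when the identifier trace is defined
        System.Console.WriteLine("about to compute");
#endif
        int total = 2 + 3;
        System.Console.WriteLine(total);
    }
}
```

Compiled as csc a.cs, the program prints only 5. Compiled as csc /D:trace a.cs, it prints the extra line first, without any change to the source file.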

a.cs
#undef vijay
class zzz
{
public static void Main()
{
#if vijay
System.Console.WriteLine("1");
#else
System.Console.WriteLine("2");
#endif
}
}

Output
2

Just as the define lets us create an identifier vijay, the undef does the reverse. It sets it to false, which is the default in any case. As the value of vijay is false, the else gets activated. However, if we run the above as csc /D:vijay a.cs, we are first creating an identifier vijay at the command line, then undefining it on the first line of the program, and the output will show 2 as before. You cannot use the define or undef after real code.

a.cs
#define vijay
#undef vijay
#undef vijay
class zzz
{
public static void Main()
{
#if vijay
System.Console.WriteLine("1");
#endif
}
}

People are allowed to nag you as many times as they like. Repetition has been part of human history since ancient times. You are allowed to undef as many times as you like even though it makes no logical sense.

a.cs
#define vijay
#define mukhi
class zzz
{
public static void Main()
{
#if vijay
#if mukhi
System.Console.WriteLine("1");
#endif
#endif
}
}

Output
1


You can have as many #if's within #if's as you like. We call them nested if's. If the #if is true, then the text up to the #endif is included.

a.cs
#define vijay
class zzz
{
public static void Main()
{
#if vijay
System.Console.WriteLine("1");
#else
int int;
#endif
}
}

We get no error at all in spite of the fact that we are not allowed to create a variable called int. Is C# sleeping at the wheel? It is not. As the preprocessor realized that the identifier vijay is true, it removed all the code from the #else to the #endif. C# did not flag an error at all, as the preprocessor did not allow it to see the offending code.

a.cs
class zzz
{
public static void Main()
{
#if vijay
System.Console.WriteLine("1");
#else
int int;
#endif
}
}

Compiler Error
a.cs(8,5): error CS1041: Identifier expected, 'int' is a keyword
a.cs(8,8): error CS1001: Identifier expected

Now we see the error as the identifier vijay is false. Remember, what the C# compiler sees is what the preprocessor allows it to see. The code you write and the code the compiler sees may be very different.

a.cs
#warning We have a code red
class zzz
{
public static void Main()
{
}
}

Compiler Warning
a.cs(1,10): warning CS1030: #warning: 'We have a code red'

Whenever we want a warning message to be displayed at the time of compiling our code we use #warning.

a.cs
class zzz
{
#warning We have a code red
public static void Main()
{
}
}

Compiler Warning
a.cs(3,10): warning CS1030: #warning: 'We have a code red'

Unlike the #define, the #warning can be used anywhere in our program. It enables us to add to the messages of the compiler. Also the line number changes from 1 to 3 telling us where the warning occurred.




a.cs
class zzz
{
#error We have a code red
public static void Main()
{
}
}

Compiler Error
a.cs(3,8): error CS1029: #error: 'We have a code red'

Wherever we have warnings, errors cannot be far behind. The difference is that an error, unlike a warning, stops everything in its tracks and does not let the compiler proceed ahead. No exe file is created. Normally an error or warning would be placed in an #if as follows.

a.cs
#define vijay
#define mukhi
class zzz
{
#if vijay && mukhi
#error We have a code red
#endif
public static void Main()
{
}
}

Compiler Error
a.cs(6,8): error CS1029: #error: 'We have a code red'

The && means and. The #if is true if both sides of the && are true, as they are in this case. Remove one of the above #defines and the #if will be false.
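
The #if also understands || for or, ! for not, and has an #elif for chaining conditions. The sketch below is our own example, with only vijay defined.

```csharp
#define vijay
class zzz
{
    public static void Main()
    {
#if vijay && mukhi
        System.Console.WriteLine("both");
#elif vijay || mukhi
        System.Console.WriteLine("at least one");
#else
        System.Console.WriteLine("neither");
#endif
#if !mukhi
        System.Console.WriteLine("mukhi is not defined");
#endif
    }
}
```

As mukhi is not defined, the first condition is false and the #elif is true, so the program prints at least one followed by mukhi is not defined.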

a.cs
#line 100 "vijay"
class zzz
{
#warning We have a code red
public static void Main()
{
}
}

Compiler Warning
vijay(102,10): warning CS1030: #warning: 'We have a code red'

The line directive does two things. First, it changes the number of the line that follows it from 2, which is what it should be, to 100. Thus the warning is reported on line 102 and not on line 4. Second, the file name changes from a.cs to vijay. You have total control over the line number and file name displayed.

a.cs
#line 100 "vijay"
class zzz
{
public static void Main()
{
int int;
#line 200 "mukhi"
char char;
}
}

Compiler Error
vijay(104,5): error CS1041: Identifier expected, 'int' is a keyword
vijay(104,8): error CS1001: Identifier expected
mukhi(200,6): error CS1041: Identifier expected, 'char' is a keyword
mukhi(200,10): error CS1001: Identifier expected

The #line directive does not only work with #error or #warning. It also affects the line number and file name in the compiler's own error messages. You can have as many #lines as you like.
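
To go back to the real file name and line numbers, C# also offers #line default. The following is a minimal sketch, with warning texts invented by us.

```csharp
#line 100 "vijay"
class zzz
{
#warning reported against vijay
#line default
#warning reported against the real file
    public static void Main()
    {
    }
}
```

The first warning should be reported against the file name vijay, while the second, coming after #line default, should carry the real file name and the actual line number in the file.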
