Tuesday, January 20, 2015

Validate all input parameters in one line/Check several objects for empty or null in a single method call.



Often my methods start with several guarding clauses. That is, conditional if statements that check for null or empty parameters and immediately return if they are. This is also known as defensive programming. Usually I am not concerned with this code; indeed I identify these blocks as validation code and overlook this code entirely. It was not until recently that I noticed this as an area that I was repeating myself and could be put in a reusable function.

Lets look at the code:

/// <summary>
/// Checks the parameters for empty, nulls, or invalid states.
/// </summary>
/// <returns>True if the params are null, empty, contains an array or object that is null or empty, contains a blank, whitespace, null or empty string, or contains DataTable that does not pass a call to IsValidDatatable().</returns>
public static bool ContainsNullOrEmpty(params object[] Items)
{
    if (Items == null || Items.Length < 1)
        return true;
    
    foreach (object item in Items)
    {
        if (item == null)
            return true;
        
        if (item is string)
        {
            if (string.IsNullOrWhiteSpace(item as String))
                return true;
        }
        else if (item is DataTable)
        {
            if (!IsValidDatatable(item as DataTable))
                return true;
        }
        
        if (item.GetType().IsArray)
        {
            bool isEmpty = true;
            foreach (object itm in (Array)item)
            {
                if (ContainsEmptyOrNulls(itm))
                    return true;
                
                isEmpty = false;
            }
            if (isEmpty)
                return true;
        }
    }

    return false;
}


My approach above uses the params keyword. By using params, I can pass in any number of parameters (including zero, although that doesn't help us in this context). By using an object instead of a generic type I can pass multiple different types in one method call. If I used generics, I would have to have one method call for each type that I wanted to validate.

The idea is to cram all of your common validation logic into this method, and call it everywhere to increase the readability of your business logic by not cluttering it up with validation logic. Notice how this function also checks for empty or white-space strings, as well as calling a custom IsValidDatatable(DataTable) function. If you have several functions that return an int of -1 or 0 upon failure, you might want to add another conditional to check if (item is int) and then if the value of the integer reflects an erroneous state.

Another cool feature it that if the item is an an array, or even an array of nested arrays, it will still check every item in those arrays. Notice how I get the item type, then check the Type.IsArray property boolean. If it is true, I cast the object as an System.Array, then call ContainsEmptyOrNulls() recursively in a foreach loop. We can return true right away on the first null condition met, but we must be careful not to return on a false condition and to instead let the false conditions fall through and continue on, in the case that there is a null later or in another array.

Enjoy!

Tuesday, January 13, 2015

EntityJustWorks - C# class/object/entity to SQL database/script/command mapper.



Note by author:

   Since writing this, I have expanded on this idea quite a bit. I have written a lightweight ORM class library that I call EntityJustWorks.

   The full project can be found on 
GitHub or CodePlex.


   EntityJustWorks not only goes from a class to DataTable (below), but also provides:


Security Warning:
This library generates dynamic SQL, and has functions that generate SQL and then immediately executes it. While it its true that all strings funnel through the function Helper.EscapeSingleQuotes, this can be defeated in various ways and only parameterized SQL should be considered SAFE. If you have no need for them, I recommend stripping semicolons ; and dashes --. Also there are some Unicode characters that can be interpreted as a single quote or may be converted to one when changing encodings. Additionally, there are Unicode characters that can crash .NET code, but mainly controls (think TextBox). You almost certainly should impose a white list:
string clean = new string(dirty.Where(c => "abcdefghijklmnopqrstuvwxyz0123456789.,\"_ !@".Contains(c)).ToArray());

PLEASE USE the SQLScript.StoredProcedure and DatabaseQuery.StoredProcedure classes to generate SQL for you, as the scripts it produces is parameterized. All of the functions can be altered to generate parameterized instead of sanitized scripts. Ever since people have started using this, I have been maintaining backwards compatibility. However, I may break this in the future, as I do not wish to teach one who is learning dangerous/bad habits. This project is a few years old, and its already showing its age. What is probably needed here is a total re-write, deprecating this version while keep it available for legacy users after slapping big warnings all over the place. This project was designed to generate the SQL scripts for standing up a database for a project, using only MY input as data. This project was never designed to process a USER'S input.! Even if the data isn't coming from an adversary, client/user/manually entered data is notoriously inconsistent. Please do not use this code on any input that did not come from you, without first implementing parameterization. Again, please see the SQLScript.StoredProcedure class for inspiration on how to do that.





    In this post, I bring the wagons 'round full circle and use what I have shown you in older posts to make a class that can create (or insert into) a table in an SQL database from a C# class object at run-time. The only prior knowledge needed would be the connection string. The name of the the table will be the name of the class Type, and the tables's columns are created from the class's public properties, with their names being the property names.

    If you want to take a data-first approach, you can generate the code to C# classes that mirrors a table in a relational database given a connection string and the table name, or use the new Emit class and generate a dynamic assembly at run-time that contains a class with public auto-properties matching the DataColumns that you can use to store DataRows in. Similarly, the name of class name will be the same name as the table, with each of the table's columns being represented by a public auto property on the class of the same name and type. An improvement here would be to create the class inside a dynamic assembly using System.Reflection.Emit. Anywho, I have been holding on to this code for too long I just want to get it out of the door.

    For those of you who just want the code:
      GitHub  (Download zip)
      BitBucket (Download)
      code.msdn.microsoft.com

    Going from a query to a DataTable is as simple as a call to SQLAdapter.Fill(). For populating the public properties of a class from a DataTable, see this post I made. To build a DataTable who's columns match the public properties of a class, check out this post here. Going from a DataTable to a class as a C# code file is covered in this previous post.

    Going from a DataTable to SQL CREATE TABLE or INSERT INTO scripts that can be passed to ExecuteNonQuery() to execute the command, see below.

    Here is the INSERT INTO code. This code takes advantage of some functions in the Helper class that converts a DataTable's ColumnNames and Row's Values as a List of String, which is then passed into a function that takes a list of string and formats it with a delimiter string and prefix and postfix strings so I can easily format the list to look like the COLUMNS and VALUES part of an SQL INSERT INTO statement. For example: INSERT INTO ([FullName], [Gender], [BirthDate]) VALUES ('John Doe', 'M', '10/3/1981''). If you want to view how the functions RowToColumnString(), RowToValueString(), and ListToDelimitedString() work, click here to view Helper.cs.

public static string InsertInto<T>(params T[] ClassObjects) where T : class
{
   DataTable table = Map.ClassToDatatable<T>(ClassObjects);
   return InsertInto(table);   // We don't need to check IsValidDatatable() because InsertInto does
}

public static string InsertInto(DataTable Table)
{
   if (!Helper.IsValidDatatable(Table))
      return string.Empty;

   StringBuilder result = new StringBuilder();
   foreach (DataRow row in Table.Rows)
   {
      if (row == null || row.Table.Columns.Count < 1 || row.ItemArray.Length < 1)
         return string.Empty;

      string columns = Helper.RowToColumnString(row);
      string values = Helper.RowToValueString(row);

      if (string.IsNullOrWhiteSpace(columns) || string.IsNullOrWhiteSpace(values))
         return string.Empty;

      result.AppendFormat("INSERT INTO [{0}] {1} VALUES {2}", Table.TableName, columns, values);
   }

   return result.ToString();
}

    Here is the CREATE TABLE CODE. A very important function here is the GetDataTypeString() member, which converts the Column.DataType to the SQL equivalent as a string, which is used to craft a CREATE TABLE script. I use static strings formatted to work with String.Format to replace the crucial bits in the CREATE TABLE command:

public static string CreateTable<T>(params T[] ClassObjects) where T : class
{
 DataTable table = Map.ClassToDatatable<T>(ClassObjects);
 return Script.CreateTable(table);
}

public static string CreateTable(DataTable Table)
{
   if (Helper.IsValidDatatable(Table, IgnoreRows: true))
      return string.Empty;

   StringBuilder result = new StringBuilder();
   result.AppendFormat("CREATE TABLE [{0}] ({1}", Table.TableName, Environment.NewLine);

   bool FirstTime = true;
   foreach (DataColumn column in Table.Columns.OfType<DataColumn>())
   {
      if (FirstTime) FirstTime = false;
      else
         result.Append(",");

      result.AppendFormat("[{0}] {1} {2}NULL{3}",
         column.ColumnName,
         GetDataTypeString(column.DataType),
         column.AllowDBNull ? "" : "NOT ",
         Environment.NewLine
      );
   }
   result.AppendFormat(") ON [PRIMARY]{0}GO", Environment.NewLine);

   return result.ToString();
}

private static string GetDataTypeString(Type DataType)
{
   switch (DataType.Name)
   {
      case "Boolean": return "[bit]";
      case "Char": return "[char]";
      case "SByte": return "[tinyint]";
      case "Int16": return "[smallint]";
      case "Int32": return "[int]";
      case "Int64": return "[bigint]";
      case "Byte": return "[tinyint] UNSIGNED";
      case "UInt16": return "[smallint] UNSIGNED";
      case "UInt32": return "[int] UNSIGNED";
      case "UInt64": return "[bigint] UNSIGNED";
      case "Single": return "[float]";
      case "Double": return "[double]";
      case "Decimal": return "[decimal]";
      case "DateTime": return "[datetime]";
      case "Guid": return "[uniqueidentifier]";
      case "Object": return "[variant]";
      case "String": return "[nvarchar](250)";
      default: return "[nvarchar](MAX)";
   }
}


...


    Here is definition of all the public members:

namespace EntityJustWorks.SQL
{
   public static class Convert
   {
      public static string SQLTableToCSharp(string ConnectionString, string TableName);
      public static bool ClassToSQLTable(string ConnectionString, params T[] ClassCollection);
   }
   
   public static class Code
   {
      public static string SQLTableToCSharp(string ConnectionString, string TableName);
      public static string DatatableToCSharp(DataTable Table);
   }

   public static class Script
   {
      public static string InsertInto<T>(params T[] ClassObjects);
      public static string InsertInto(DataTable Table);
      public static string CreateTable<T>(params T[] ClassObjects);
      public static string CreateTable(DataTable Table);
   }
   
   public static class Map
   {
      public static IList<T> DatatableToClass<T>(DataTable Table);
      public static DataTable ClassToDatatable<T>(params T[] ClassCollection);
      public static DataTable ClassToDatatable<T>();
   }

   public static class Query
   {
      public static IList<T> QueryToClass<T>(string ConnectionString, string FormatString_Query,
                                                            params object[] FormatString_Parameters);
      public static DataTable QueryToDataTable(string ConnectionString, string FormatString_Query,
                                                            params object[] FormatString_Parameters);
      public static T QueryToScalarType<T>(string ConnectionString, string FormatString_Query,
                                                            params object[] FormatString_Parameters);
      public static int ExecuteNonQuery(string ConnectionString, string FormatString_Command,
                                                            params object[] FormatString_Parameters);
   }
   
   public static class Helper
   {
      public static bool IsValidDatatable(DataTable Table, bool IgnoreRows = false);
      public static bool IsCollectionEmpty<T>(IEnumerable<T> Input);
      public static bool IsNullableType(Type Input);
   }
}

    Please feel free to comment, either on my GitHub or my blog, with criticisms or suggestions.

Wednesday, December 10, 2014

Humorous and unhelpful Exception Class hierarchy



This post covers some of the lesser-known exception classes that exist. It is okay if you have never heard of these exception classes before... in fact, if you haven’t, that is probably a very, very good thing...





WTFException
The exception that is thrown when the code has encountered a genuine ‘WTF’ moment. This is usually only reserved for areas of the code that should never be reached, or indicates a state/condition that is either invalid, unexpected, or simply illogical. It can also be thrown to indicate that the developer has no idea why something is not working, and is at his wits end, or optionally at the end of some code that the developer was absolutely certain would never work.

See Also: CodeUnreachableException, ThisShouldNeverHappenException, and SpaceTimeException.


class WTFException : Exception
{
    public WTFException()
        : base() { }
    
    public WTFException(string message, params object[] messageParams)
        : base(string.Format(message, messageParams)) { }
    
    public WTFException(string message, Exception innerException)
        : base(message, innerException) { }
}




SpaceTimeException
The exception that is thrown when there is a serious problem with the fabric of space/time. Typically, the error message property will be empty or missing, as the actual even that caused this exception to be thrown has not yet occurred in the same timeline that your applications exists in. No attempt should be made to retrieve the InnerException object, as a black hole could result that would be almost certain to annihilate your code base, or possibly all of humanity. The only thing we can be certain of, however, is that your boss will be very cross with you and you will most likely get fired.


class SpaceTimeException : Exception
{
    public SpaceTimeException()
        : base() { }

    public SpaceTimeException(string message, Exception innerException)
        : base(message, innerException) { }
}




ArgumentInvalidException
You argument is invalid. You are wrong, your reasoning makes no logical sense, and you probably failed debate class in school. Even if your argument has some semblance of being based in logic or reality, you cannot argue with the compiler, there is no point, so quit trying.


class ArgumentInvalidException : Exception
{
    public ArgumentInvalidException()
        : base() { }

    public ArgumentInvalidException(string message, Exception innerException)
        : base(message, innerException) { }   
}





CodeUnreachableExeception

This exception is thrown when the assembly executes a block of code that is unreachable. This exception should never get thrown, and is, by definition, impossible. Catching this exception is like catching a leprechaun, and should it ever happen, the CLR will ask you for three wishes.


class CodeUnreachableExeception : Exception
{
    public CodeUnreachableExeception()
        : base() { }

    public CodeUnreachableExeception(string message, Exception innerException)
        : base(message, innerException) { }
}




OutOfCoffeeExecption
This exception is thrown when a developer has ran out of coffee or sufficient motivation to complete the function or feature. The exception is thrown at the exact line that the developer got up for a coffee break and never returned.


class OutOfCoffeeExecption : Exception
{
    public OutOfCoffeeExecption()
        : base() { }

    public OutOfCoffeeExecption(string message, Exception innerException)
        : base(message, innerException) { }
}




UnknownException
This exception is not thrown because a parameter or method specified is unknown, but rather that the implementing developer did not have the knowledge to implement the feature or method you just called, and threw this exception instead as quick and dirty solution.


class UnknownException : Exception
{
    public UnknownException()
        : base() { }

    public UnknownException(string message, Exception innerException)
        : base(message, innerException) { }
}




ThisIsNotTheExceptionYouAreLookingForException
What? Oh this? Oh, this was not meant for you. Now move along...


class ThisIsNotTheExceptionYouAreLookingForException : Exception
{
    /* Move along... */
}



Wednesday, December 3, 2014

Word Frequency Dictionary & Sorted Occurrence Ranking Dictionary for Generic Item Quantification and other Statistical Analyses



Taking a little inspiration from DotNetPerl's MultiMap, I created a generic class that uses a Dictionary under the hood to create a keyed collection that tracks the re-occurrences or frequency matching or duplicate items of arbitrary type. For lack of a better name, I call it a FrequencyDicionary. Instead of Key/Value pairs, the FrequencyDictionary has the notion of a Item/Frequency pair, with the key being the Item. I strayed from Key/Value because I may ultimately end up calling the Item the Value.

I invented the FrequencyDictionary while writing a pisg-like IRC log file statistics generator program. It works well for word frequency analysis. After some pre-processing/cleaning of the text log file, I parse each line by spaces to get an array of words. I then use FrequencyDictionary.AddRange to add the words to my dictionary.

The FrequencyDictionary is essentially a normal Dictionary (with T being type String in this case), however when you attempt to add a key that already exists, it simply increments the Value integer. After processing the file, you essentially have a Dictionary where the Key is a word that was found in the source text, and the Value is the number of occurrences of that word in the source text.

Taking the idea even further, I created complementary Dictionary that essentially a sorted version of FrequencyDictionary with the Key/Values reversed. I call it a RankingDictionary because the Keys represent the number of occurrences of a word, with the Values being a List of words that occurred that many times in the source text. Its called a RankingDictionary because I was using it to produce a top 10 ranking of most popular words, most popular nicks, ect.

The FrequencyDictionary has a GetRankingDictionary() method in it to make the whole thing very easy to use. Typically I don't use the FrequencyDictionary too extensively, but rather as a means to get a RankingDictionary, which I base the majority of my IRC statistics logic on. The RankingDictionary also proved very useful for finding Naked Candidates and Hidden Pairs or Triples in my Sudoku solver application that I will be releasing on Windows, Windows Phone and will be blogging about shortly. Hell, I was even thinking about releasing the source code to my Sudoku App, since its so elegant and a great example of beautiful, readable code to a complex problem.

Anyways, the code for the Frequency and Ranking Dictionary is heavily commented with XML Documentation Comments, so I'm going to go ahead and let the code speak for itself. I will add usage examples later. In fact, I will probably release the pisg-like IRC Stats Prog source code since I don't think I'm going to go any farther with it.

Limitations: I ran into problems trying to parse large text files approaching 3-4 megabytes. Besides taking up a bunch of memory, the Dictionary begins to encounter many hash collisions once the size of the collection gets large enough. This completely kills performance, and eventually most the time is spent trying to resolve collisions, so it grinds to a near halt. You might notice the constructor public FrequencyDictionary(int Capacity), where you can specify a maximum capacity for the Dictionary. A good, safe ceiling is about 10,000. A better implementation of the GetHash() method might be in order, but is not a problem I have felt like solving yet.




/// <summary>
/// A keyed collection of Item/Frequency pairs, (keyed off Item).
/// If a duplicate Item is added to the Dictionary, the Frequency for that Item is incremented.
/// </summary>
public class FrequencyDictionary<ITM>
{
  // The underlying Dictionary
  private Dictionary<ITM, int> _dictionary;
  
  /// <summary>
  /// Initializes a new instance of the FrequencyDictionary that is empty.
  /// </summary>
  public FrequencyDictionary() { _dictionary = new Dictionary<ITM, int>(); }
  
  /// <summary>
  /// Initializes a new instance of the FrequencyDictionary that has a maximum Capacity limit.
  /// </summary>
  public FrequencyDictionary(int Capacity) { _dictionary = new Dictionary<ITM, int>(); }
  
  /// <summary>
  /// Gets a collection containing the Items in the FrequencyDictionary.
  /// </summary>
  public IEnumerable<ITM> ItemCollection { get { return this._dictionary.Keys; } }
  
  /// <summary>
  /// Gets a collection containing the Frequencies in the FrequencyDictionary.
  /// </summary>
  public IEnumerable<int> FrequencyCollection { get { return this._dictionary.Values;} }
  
  /// <summary>
  /// Adds the specified Item to the FrequencyDictionary.
  /// </summary>
  public void Add(ITM Item)
  {
    if  ( this._dictionary.ContainsKey(Item))   { this._dictionary[Item]++; }
    else    { this._dictionary.Add(Item,1); }
  }
  
  /// <summary>
  /// Adds the elements of the specified array to the FrequencyDictionary.
  /// </summary>
  public void AddRange(ITM[] Items)
  {
    foreach(ITM item in Items) { this.Add(item); }
  }
  
  /// <summary>
  /// Gets the Item that occurs most frequently.
  /// </summary>
  /// <returns>A KeyValuePair containing the Item (key) and how many times it has appeard (value).</returns>
  public KeyValuePair<ITM,int> GetMostFrequent()
  {
    int maxValue = this._dictionary.Values.Max();
    return this._dictionary.Where(kvp => kvp.Value == maxValue).FirstOrDefault();
  }
  
  /// <summary>
  /// Gets the number of Item/Frequency pairs contained in the FrequencyDictionary.
  /// </summary>
  public int Count { get { return this._dictionary.Count; } }
  
  /// <summary>
  /// Returns an enumerator that iterates through the FrequencyDictionary.
  /// </summary>
  public IEnumerator<KeyValuePair<ITM,int>> GetEnumerator()
  {
    return this._dictionary.GetEnumerator();
  }
  
  /// <summary>
  /// Gets the Frequency (occurrences) associated with the specified Item.
  /// </summary>
  public int this[ITM Item]
  {
    get { if (this._dictionary.ContainsKey(Item)) { return this._dictionary[Item]; } return 0; }
  }
  
  /// <summary>
  /// Creates a RankingDictionary from the current FrequencyDictionary.
  /// </summary>
  /// <returns>A RankingDictionary of Frequency/ItemCollection pairs ordered by Frequency.</returns>
  public RankingDictionary<ITM> GetRankingDictionary()
  {
    RankingDictionary<ITM> result = new RankingDictionary<ITM>();
    foreach(KeyValuePair<ITM,int> kvp in _dictionary)
    {
      result.Add(kvp.Value,kvp.Key);
    }
    return result;
  }
  
  /// <summary>
  /// Displays usage information for FrequencyDictionary 
  /// </summary>
  public override string ToString()
  {
    return "FrequencyDictionary<Item, Frequency> : Key=\"Item=\", Value=\"Frequency\"\".";
  }
}




And now the RankingDictionary:

/// <summary>
/// A keyed collection of Frequency/ItemCollection pairs that is ordered by Frequency (rank).
/// If an Item is added that has the same Frequency as another Item, that Item is added to the Item collection for that Frequency.
/// </summary>
public class RankingDictionary<ITM>
{
  // Underlying dictionary
  SortedDictionary<int,List<ITM>> _dictionary;
  
  /// <summary>
  /// Initializes a new instance of the FrequencyDictionary that is empty.
  /// </summary>
  public RankingDictionary() { _dictionary = new SortedDictionary<int,List<ITM>>(new FrequencyComparer()); }
  
  /// <summary>
  /// The Comparer used to compare Frequencies.
  /// </summary>
  public class FrequencyComparer : IComparer<int>
  {
    public int Compare(int one,int two) { if(one == two) return 0; else if(one > two) return -1; else return 1; }
  }
  
  /// <summary>
  /// Gets a collection containing the Frequencies in the RankingDictionary.
  /// </summary>
  public IEnumerable<int> FrequencyCollection { get { return this._dictionary.Keys; } }
  
  /// <summary>
  /// Gets a collection containing the ItemCollection in the RankingDictionary.
  /// </summary>
  public IEnumerable<List<ITM>> ItemCollections   { get { return this._dictionary.Values; } }
  
  /// <summary>
  /// Adds the specified Frequency and Item to the RankingDictionary.
  /// </summary>
  public void Add(int Frequency, ITM Item)
  {
    List<ITM> itemCollection = new List<ITM>();
    itemCollection.Add(Item);
    // If the specified Key is not found, a set operation creates a new element with the specified Key
    this._dictionary[Frequency] = itemCollection;
  }
  
  /// <summary>
  ///  Gets the number of Frequency/ItemCollection pairs contained in the RankingDictionary.
  /// </summary>
  public int Count { get { return this._dictionary.Count; } }
  
  /// <summary>
  /// Returns an enumerator that iterates through the RankingDictionary.
  /// </summary>
  public IEnumerator<KeyValuePair<int,List<ITM>>> GetEnumerator()
  {
    return this._dictionary.GetEnumerator();
  }
  
  /// <summary>
  /// Gets the ItemCollection associated with the specified Frequency.
  /// </summary>
  public List<ITM> this[int Frequency]
  {
    get
    {
      List<ITM> itemCollection;
      if (this._dictionary.TryGetValue(Frequency,out itemCollection)) return itemCollection;
      else return new List<ITM>();
    }
  }
  
  /// <summary>
  /// Displays usage information for RankingDictionary 
  /// </summary>
  public override string ToString()
  {
    return "RankingDictionary<Frequency, List<Item>> : Key=\"Frequency\", Value=\"List<Item>\".";
  }
}


Usage examples will be posted later.

Tuesday, October 28, 2014

Create C# Class Code From a DataTable using CodeDOM



Note by author:

   Since writing this, I have expanded on this idea quite a bit. I have written a lightweight ORM class library that I call EntityJustWorks.

   The full project can be found on
GitHub or CodePlex.


   EntityJustWorks not only goes from a class to DataTable (below), but also provides:

Security Warning:
This library generates dynamic SQL, and has functions that generate SQL and then immediately executes it. While it its true that all strings funnel through the function Helper.EscapeSingleQuotes, this can be defeated in various ways and only parameterized SQL should be considered SAFE. If you have no need for them, I recommend stripping semicolons ; and dashes --. Also there are some Unicode characters that can be interpreted as a single quote or may be converted to one when changing encodings. Additionally, there are Unicode characters that can crash .NET code, but mainly controls (think TextBox). You almost certainly should impose a white list: string clean = new string(dirty.Where(c => "abcdefghijklmnopqrstuvwxyz0123456789.,\"_ !@".Contains(c)).ToArray()); 
PLEASE USE the SQLScript.StoredProcedure and DatabaseQuery.StoredProcedure classes to generate SQL for you, as the scripts it produces is parameterized. All of the functions can be altered to generate parameterized instead of sanitized scripts. Ever since people have started using this, I have been maintaining backwards compatibility. However, I may break this in the future, as I do not wish to teach one who is learning dangerous/bad habits. This project is a few years old, and its already showing its age. What is probably needed here is a total re-write, deprecating this version while keep it available for legacy users after slapping big warnings all over the place. This project was designed to generate the SQL scripts for standing up a database for a project, using only MY input as data. This project was never designed to process a USER'S input.! Even if the data isn't coming from an adversary, client/user/manually entered data is notoriously inconsistent. Please do not use this code on any input that did not come from you, without first implementing parameterization. Again, please see the SQLScript.StoredProcedure class for inspiration on how to do that.


So far I have posted several times on the DataTable class. I have shown how to convert a DataTable to CSV or tab-delimited file using the clipboard, how to create a DataTable from a class using reflection, as well as how to populate the public properties of a class from a DataTable using reflection. Continuing along these lines, I decided to bring the DataTable-To-Class wagons around full-circle and introduce a class that will generate the C# code for the class that is used by the DataTableToClass<T> function, so you don't have to create it manually. The only parameter required to generate the C# class code is, of course, a DataTable.

The code below is rather trivial. It uses CodeDOM to build up a class with public properties that match the names and data types of the data columns of the supplied DataTable. I really wanted the output code to use auto properties. This is not supported by CodeDOM, however, so I used a little hack or workaround to accomplish the same thing. I simply added the getter and setter code for the property to the member's field name. CodeDOM adds a semicolon to the end of the CodeMemberField statement, which would cause the code not to compile, so I added two trailing slashes "//" to the end of the field name to comment out the semicolon. The whole point of creating auto properties was to have clean, succinct code, so after I generate the source code file, I clean up the commented-out semicolons by replacing every occurrence with an empty string. The main disadvantage of this 'workaround' is that the code cannot be used to generate a working class in Visual Basic code. I do have proper CodeDOM code that does not employ this workaround, but I prefer the output code to contain auto-properties; auto-generated code is notorious for being messy and hard to read, and I did not want my generated code to feel like generated code.


Below is the DataTableToCode function, its containing class and its supporting functions. The code is short, encapsulated, clean and commented, so I will just let it speak for itself:

public static class DataTableExtensions
{
   public static string DataTableToCode(DataTable Table)
   {
      string className = Table.TableName;
      if(string.IsNullOrWhiteSpace(className))
      {   // Default name
         className = "Unnamed";
      }
      className += "TableAsClass";
      
      // Create the class
      CodeTypeDeclaration codeClass = CreateClass(className);
      
      // Add public properties
      foreach(DataColumn column in Table.Columns)
      {
         codeClass.Members.Add( CreateProperty(column.ColumnName, column.DataType) );
      }
      
      // Add Class to Namespace
      string namespaceName = "AutoGeneratedDomainModels";
      CodeNamespace codeNamespace = new CodeNamespace(namespaceName);
      codeNamespace.Types.Add(codeClass);
      
      // Generate code
      string filename = string.Format("{0}.{1}.cs",namespaceName,className);
      CreateCodeFile(filename, codeNamespace);
      
      // Return filename
      return filename;
   }
   
   static CodeTypeDeclaration CreateClass(string name)
   {
      CodeTypeDeclaration result = new CodeTypeDeclaration(name);
      result.Attributes = MemberAttributes.Public;
      result.Members.Add(CreateConstructor(name)); // Add class constructor
      return result;
   }
   
   static CodeConstructor CreateConstructor(string className)
   {
      CodeConstructor result = new CodeConstructor();
      result.Attributes = MemberAttributes.Public;
      result.Name = className;
      return result;
   }
   
   static CodeMemberField CreateProperty(string name, Type type)
   {
      // This is a little hack. Since you cant create auto properties in CodeDOM,
      //  we make the getter and setter part of the member name.
      // This leaves behind a trailing semicolon that we comment out.
      //  Later, we remove the commented out semicolons.
      string memberName = name + "\t{ get; set; }//";
      
      CodeMemberField result = new CodeMemberField(type,memberName);
      result.Attributes = MemberAttributes.Public | MemberAttributes.Final;
      return result;
   }
   
   static void CreateCodeFile(string filename, CodeNamespace codeNamespace)
   {
      // CodeGeneratorOptions so the output is clean and easy to read
      CodeGeneratorOptions codeOptions = new CodeGeneratorOptions();
      codeOptions.BlankLinesBetweenMembers = false;
      codeOptions.VerbatimOrder = true;
      codeOptions.BracingStyle = "C";
      codeOptions.IndentString = "\t";
      
      // Create the code file
      using(TextWriter textWriter = new StreamWriter(filename))
      {
         CSharpCodeProvider codeProvider = new CSharpCodeProvider();
         codeProvider.GenerateCodeFromNamespace(codeNamespace, textWriter, codeOptions);
      }
      
      // Correct our little auto-property 'hack'
      File.WriteAllText(filename, File.ReadAllText(filename).Replace("//;", ""));
   }
}


An example of the resulting code appears below:

namespace AutoGeneratedDomainModels
{
   public class CustomerTableAsClass
   {
      public CustomerTableAsClass()
      {
      }
      public string FirstName   { get; set; }
      public string LastName    { get; set; }
      public int  Age           { get; set; }
      public char Sex           { get; set; }
      public string Address     { get; set; }
      public string Birthdate   { get; set; }
   }
}

I am satisfied with the results and look of the code. The DataTableToCode() function can be a huge time saver if you have a large number of tables you need to write classes for, or if each DataTable contains a large number of columns.

If you found any of this helpful, or have any comments or suggestions, please feel free to post a comment.