Wednesday, December 29, 2010

Linq in depth

Hi,
This post starts a new series on LINQ. The topics I will cover along the way are:
1. Why LINQ?
2. Extension Methods
3. Delegates
4. Anonymous Methods
5. Generic delegates (Func, Action and Predicate)
6. Lambda Expressions
7. Relating Delegates, Anonymous Methods and Lambda Expressions
8. Writing Lambda Expressions
9. IEnumerable basics
10. LINQ operators
11. LINQ to Objects
12. Using Lambda Expressions in LINQ operators
13. Query Syntax
14. XML Basics
15. LINQ to XML


Why LINQ

LINQ (Language Integrated Query) is a set of language and library features that gives you one uniform way to query any kind of data. It is important to understand that data does not mean only a database. Data can live in a collection of domain objects, an XML document, a linked list and so on. LINQ unifies the way you access all of these sources from your code.

For instance, List<T> is a collection, and we can query a List<T> using LINQ.


Example
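using System;
using System.Collections.Generic;
using System.Linq;

class Program
    {
        static void Main(string[] args)
        {
            List<string> names = new List<string> { "Ajith", "Kiran", "Sanjay", "Anoop" };
            // Where is a LINQ extension method; the lambda is the filter condition.
            var filteredNames = names.Where(k => k == "Ajith");
            foreach (var name in filteredNames)
            {
                Console.WriteLine(name);
            }
            Console.ReadKey();
        }
    }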



The output would be "Ajith".
Here we filtered the list using the "Where" operator. In upcoming sessions I will cover each LINQ operator and how it is used.
Extension Methods
Extension methods enable you to "add" methods to existing types without creating a new derived type, recompiling, or otherwise modifying the original type. Extension methods are a special kind of static method, but they are called as if they were instance methods on the extended type.
LINQ operators such as Where and Select are extension methods. We will look at the LINQ operators in depth in the upcoming modules.
Rules to write extension method
a.       The class that contains the extension method must be a static class, and the method itself must be static.
b.       The first parameter must be of the type to be extended and must be decorated with the “this” keyword. For example, if you want to extend the type “string” then it should be,
public static class ExtensionMethod
   {
        public static void StringExtension(this string notParam)
        {
        }
   }
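Here ‘notParam’ is declared as the first parameter, but the caller never passes it explicitly: it receives the instance on which the method is called.
Now we can call the method “StringExtension” on any value of type string. For example,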
string myVar = null;
myVar.StringExtension();

"name".StringExtension();

We can’t call “StringExtension” on any other type, because it is an extension method that extends the type string. If we extend the type “int” instead, that extension method can be called only on “int” values.
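To make the rules above concrete, here is a minimal sketch of a string extension method that actually does some work. The names StringExtensions and WordCount are hypothetical and chosen only for illustration; they are not from the .NET Framework.

```csharp
using System;

// Hypothetical helper for illustration; not part of the .NET Framework.
public static class StringExtensions
{
    // "this string text" marks WordCount as an extension of the string type.
    public static int WordCount(this string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return 0;
        }
        // Split on spaces and ignore empty entries caused by repeated spaces.
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        string sentence = "LINQ is built on extension methods";
        Console.WriteLine(sentence.WordCount());                 // instance-style call
        Console.WriteLine(StringExtensions.WordCount(sentence)); // equivalent static call
    }
}
```

Both calls compile to the same static method call, which is why an extension method can even be invoked on a null reference, as in the myVar example above.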

Extend custom class

A class is also a type, so we can extend the functionality of our own classes too. For example,

class MyClass
    {

    }
If I want to write an extension method for “MyClass” then it would be,

public static class ExtensionMethod
    {
        public static void MyClassExtension(this MyClass notparam)
        {

        }
    }
You can call “MyClassExtension” by creating an object of “MyClass”. For example,

MyClass obj = new MyClass();
obj.MyClassExtension();

Why Extension Methods

a.       If you want to add a new method to a sealed class (a sealed class is a class that cannot be inherited from), you can write an extension method for that class.
b.       If you want to add a new function to an already defined interface, you can extend that interface. For example, here I am extending IEnumerable<object>,

public static class ExtensionMethod
    {
        public static void IenumerableExtension(this IEnumerable<object> notparam)
        {

       }
    }

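I can call “IenumerableExtension” on any type that implements IEnumerable<object>. For example, List<string> implements IEnumerable<string>, and since C# 4.0 an IEnumerable<string> can be used wherever an IEnumerable<object> is expected (generic covariance, which applies to reference types only), so we get “IenumerableExtension” on a List<string>. It would not be offered on a List<int>; to cover every element type, make the method generic and extend IEnumerable<T>. Example,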

List<string> list = new List<string>();
list.IenumerableExtension();

LINQ operators are extension methods that extend the type IEnumerable<T>. The generic collection classes (List<T>, arrays, Dictionary<TKey, TValue> and so on) implement the IEnumerable<T> interface, and that is the reason why we get the LINQ operators on these collections.
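As a rough illustration of how an operator like "Where" can be built as an extension method on IEnumerable<T>, here is a simplified sketch. SequenceExtensions and MyWhere are hypothetical names, and this is not the actual System.Linq source; Func and lambda expressions, which it relies on, are explained later in this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names (SequenceExtensions, MyWhere); not the real System.Linq code.
public static class SequenceExtensions
{
    // Extends every IEnumerable<T>; the caller supplies the filter as a delegate.
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item; // hand back only the items that pass the test
            }
        }
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        var names = new List<string> { "Ajith", "Kiran", "Sanjay", "Anoop" };
        // MyWhere shows up on the list because List<string> implements IEnumerable<string>.
        foreach (string name in names.MyWhere(n => n.StartsWith("A")))
        {
            Console.WriteLine(name);
        }
    }
}
```

Because the method is generic, it is offered on a List<int> or an array just as well as on a List<string>.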
Lambda Expression
The lambda expression is a C# 3.0 language feature. It is a more concise form of the anonymous method, and an anonymous method is in turn a shorter way of creating a delegate. So before digging into lambda expressions I will give you a basic understanding of delegates and anonymous methods.

Delegate
Put simply, a delegate is a type-safe function pointer: a delegate can point to any method that matches the delegate’s signature. For example, consider this delegate,

public delegate string MyDelegate(int myParam);

Using MyDelegate I can point to any method which accepts one parameter of type int and returns a string. For example, I can point MyDelegate to the method,

public static string HelloWorld(int hello)
        {
            return "Hello World!!";
        }

But I can’t point MyDelegate to the method,

public static string Wrong()
        {
            return "I can’t point this method with MyDelegate because it is not accepting an integer parameter!!";
        }

because it does not match MyDelegate’s signature.

I can point MyDelegate to “HelloWorld” as,

        static void Main(string[] args)
        {
            MyDelegate del = HelloWorld;
            Console.WriteLine(del(100));
            Console.ReadKey();
        }

The output would be "Hello World!!". Once “HelloWorld” is assigned to the delegate, every call to “del” runs the “HelloWorld” method. The entire program looks like,

using System;

namespace ConsoleApplication4
{
    class Program
    {
        public delegate string MyDelegate(int myParam);
        static void Main(string[] args)
        {
            MyDelegate del = HelloWorld;
            Console.WriteLine(del(100));
            Console.ReadKey();
        }
        public static string HelloWorld(int hello)
        {
            return "Hello World!!";
        }
    }
}

That covers the very basics of delegates.

Anonymous Methods
Anonymous methods were introduced in C# 2.0. Before anonymous methods, we needed a named method matching the delegate’s signature in order to assign it to the delegate, as discussed in the previous section. With an anonymous method we don’t need a separate named method. Here I am rewriting the “MyDelegate” example so that it points to an anonymous method.

public delegate string MyDelegate(int myParam);
        static void Main(string[] args)
        {
            MyDelegate del = delegate(int hello) { return "Hello World!!"; };
            Console.WriteLine(del(100));
            Console.ReadKey();
        }

The output would be “Hello World!!”. Now let us look at the sample more closely,

MyDelegate del = delegate(int hello) { return "Hello World!!"; };

Here “int hello” is the method parameter and whatever is inside the curly braces is the method body. Say I have a delegate, TestDelegate, which accepts two input parameters of type “int” and “string” and returns a “string”; then we can write it as,

TestDelegate test = delegate(int a, string b) { return "Method accepts two parameters!!"; };

With an anonymous method we don’t need a named method. We can assign the method body to the delegate directly, in the form
delegate(method parameters) { method body };
That is all about the basics of anonymous methods.
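Here is a small runnable sketch of the two-parameter case. The post uses TestDelegate without showing its declaration, so the declaration below is an assumption based on the description (an “int” and a “string” in, a “string” out); the Describe helper is hypothetical and only there to show the delegate being passed around.

```csharp
using System;

public class Program
{
    // Assumed declaration: the post uses TestDelegate without showing it.
    public delegate string TestDelegate(int a, string b);

    // Hypothetical helper that invokes whatever method the delegate points to.
    public static string Describe(TestDelegate test)
    {
        return test(3, "LINQ");
    }

    public static void Main(string[] args)
    {
        // The anonymous method supplies the body inline; no named method is needed.
        TestDelegate test = delegate(int a, string b)
        {
            return b + " has " + a + " parts";
        };
        Console.WriteLine(Describe(test));
    }
}
```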


Lambda Expression
Now I am going to introduce the real beauty: the lambda expression. A lambda expression is one step beyond the anonymous method, and it is a C# 3.0 language feature. Again I take the same example: a delegate that points to a method taking one input parameter of type “int” and returning a “string”. We can implement it using a lambda expression as,

MyDelegate del = a => "Hello World!!";
Console.WriteLine(del(100));

Hats off to the new language feature: the output is again "Hello World!!". The entire program looks like,


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication5
{
    class Program
    {
        public delegate string MyDelegate(int myParam);
        static void Main(string[] args)
        {
            MyDelegate del = a => "Hello World!!";
            Console.WriteLine(del(100));
            Console.ReadKey();
        }
    }
}

In upcoming sessions I will go in depth into every aspect of lambda expressions, using many examples.


Now let us examine the lambda part by part.

See the line  MyDelegate del = a => "Hello World!!";
A lambda has two parts, a left part and a right part, separated by “=>”. The left part is the input parameter. So here “a” is the input parameter, which means that if we wrote a named method for the delegate “del” to point to, it would look like,
public static string HelloWorld(int a)
        {
            return "Hello World!!";
        }
Here also “a” is the input parameter of type “int”. In the lambda expression the compiler infers that “a” is of type “int”, or we can write the type of the input parameter explicitly as,
MyDelegate del = (int a) => "Hello World!!";

But the compiler is able to work out the type on its own by checking the signature of the delegate.
Now let us check the right side of the lambda. The right-hand side is the value returned from the method. In our example the lambda expression returns "Hello World!!", exactly as the “HelloWorld” method does. The compiler determines the return type by checking the delegate signature.
Say I want to write a lambda expression for a delegate which accepts two input parameters of type “int” and returns the sum of those two parameters. What should it look like? Before explaining that, I will again write a method for the same scenario,
public static int GetSum(int param1,int param2)
        {
            return param1 + param2;
        }
Now I am converting this to a lambda expression. I want two input parameters, so I write them as (param1, param2), and I want to return the sum, so I write param1 + param2. Combining the two gives,
(param1, param2) => param1 + param2;
When we assign this to a delegate (assuming a delegate declared as public delegate int SumDelegate(int param1, int param2);) it looks like,
SumDelegate sum = (param1, param2) => param1 + param2;
Console.WriteLine(sum(100, 150));

Yes, the output would be 250.

Lambdas of this kind are called expression lambdas. If the right side is a block of statements in curly braces, it is called a statement lambda. We will come to statement lambdas in an upcoming session.
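To show the difference between the two kinds side by side, here is a minimal sketch. The SumDelegate declaration is an assumption, since the post uses it without showing it; the field names ExpressionSum and StatementSum are hypothetical.

```csharp
using System;

public class Program
{
    // Assumed declaration; the post uses SumDelegate without showing it.
    public delegate int SumDelegate(int param1, int param2);

    // Expression lambda: the right side is a single expression whose value is returned.
    public static readonly SumDelegate ExpressionSum = (param1, param2) => param1 + param2;

    // Statement lambda: the right side is a block, so "return" is written explicitly.
    public static readonly SumDelegate StatementSum = (param1, param2) =>
    {
        int total = param1 + param2;
        return total;
    };

    public static void Main(string[] args)
    {
        Console.WriteLine(ExpressionSum(100, 150));
        Console.WriteLine(StatementSum(100, 150));
    }
}
```

Both delegates compute the same sum; the statement form only becomes necessary when the body needs more than one statement.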

Generic Delegates

Next I am planning to give a brief introduction to the general-purpose generic delegates that ship with the .NET Framework. They are,
Func<>, Action<>, Predicate<>.

But before that I will explain what a generic delegate means, beginning with an example. I hope readers are aware of the basics of generics.

public delegate TResult MyGenericDelagete<in T1,in T2,out TResult>(T1 arg1,T2 arg2);

Here MyGenericDelagete is a generic delegate which accepts two input parameters and returns a value. (The “in” and “out” variance modifiers on the type parameters require C# 4.0.) The input parameters are generic, which means they can be of any type, but once the types are specified the delegate is type safe. For example, if I declare a variable of this delegate type as

MyGenericDelagete<int,int,string> mydel;

Now I can point it only to a method which accepts two input parameters of type “int” and returns a “string”. So the entire code looks like,

public delegate TResult MyGenericDelagete<in T1,in T2,out TResult>(T1 arg1,T2 arg2);
      
        static void Main(string[] args)
        {
            MyGenericDelagete<int, int, string> mydel = MyMethod;


            Console.WriteLine(mydel(100,200));
            Console.ReadKey();
        }
        public static string MyMethod(int a, int b)
        {
            return Convert.ToString(a + b);
        }

Now I am going to write a lambda for the same,
public delegate TResult MyGenericDelagete<in T1,in T2,out TResult>(T1 arg1,T2 arg2);

        static void Main(string[] args)
        {
            // The lambda replaces MyMethod, so no named method is needed.
            MyGenericDelagete<int, int, string> mydel = (a, b) => Convert.ToString(a + b);

            Console.WriteLine(mydel(100, 200));
            Console.ReadKey();
        }

No wonder, the output would be 300.

Func<>,Action<>,Predicate<>


Now I am starting with Func<>, Action<> and Predicate<>. These are the general-purpose generic delegates of the .NET Framework. I will explain them one by one.

Func<>
Func is a generic delegate. With Func<> we can point to any method that returns a value. We cannot point Func<> to a method whose return type is “void”. I will explain it with an example,

Func<int, string> myFunc;

This is a generic delegate which accepts one parameter of type “int” and returns a “string” (the last type argument of a Func is always the return type), so we can point it to any method which accepts one “int” parameter and returns a “string”. So I can point this Func to the method below.
public static string GetNameandAge(int age)
        {
            return "Kiran " + age.ToString();
        }
I can point the Func to this method as,

Func<int, string> myFunc = GetNameandAge;

The whole program looks like,

class Program
    {
        static void Main(string[] args)
        {
            Func<int, string> myFunc = GetNameandAge;
            Console.WriteLine(myFunc(30));
            Console.ReadKey();
        }
        public static string GetNameandAge(int age)
        {
            return "Kiran " + age.ToString();
        }
    }

The output would be,
Kiran 30

Next I am converting this to a lambda,

  Func<int, string> myFunc = age => "Kiran " + age.ToString();
  Console.WriteLine(myFunc(30));
  Console.ReadKey();

Now also the output would be,
Kiran 30

The whole program looks like,
class Program
    {
        static void Main(string[] args)
        {
            Func<int, string> myFunc = age => "Kiran " + age.ToString();
            Console.WriteLine(myFunc(30));
            Console.ReadKey();
        }
    }

I will explain each part of this lambda,

Here “age” on the left side of the lambda is the same as the parameter of the previous method. The compiler automatically identifies the type of “age” as “int”, and the right side of the lambda is the same as the return statement of the previous method. The compiler infers both from the Func<int, string> declaration.


Another example,
Func<string, bool> myFunc;

This is another “Func”. It can point to any method which accepts a “string” parameter and returns a “bool”. So I can point this “Func” to a method like this,
public static bool MethodReturnBool(string name)
        {
            return name == "Kiran";
        }

The whole program looks like,
static void Main(string[] args)
        {
            Func<string, bool> myFunc = MethodReturnBool;
            Console.WriteLine(myFunc("Kiran"));
            Console.ReadKey();
        }

        public static bool MethodReturnBool(string name)
        {
            return name == "Kiran";
        }
The output is “True”.

Now lambda for the same,

Func<string, bool> myFunc = name => name == "Kiran";

Here “name” is the input parameter, and name == "Kiran" is the expression whose bool value is returned.

In .NET Framework 4.0 there are 17 Func delegate types, from Func<TResult> with no input parameters up to a Func with 16 input parameters.

A “Func” must return something, but it may or may not have input parameters.
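As a quick sketch of that last point, here is a Func with no input parameters next to one with two. The field names Greeting and SumAsText are hypothetical and used only for illustration.

```csharp
using System;

public class Program
{
    // No input parameters: the single type argument is the return type.
    public static readonly Func<string> Greeting = () => "Hello World!!";

    // Two input parameters (int, int) and a string result, which is always the last type argument.
    public static readonly Func<int, int, string> SumAsText = (a, b) => Convert.ToString(a + b);

    public static void Main(string[] args)
    {
        Console.WriteLine(Greeting());
        Console.WriteLine(SumAsText(100, 200));
    }
}
```

Note the empty parentheses on the left side of the first lambda: that is how a lambda with no input parameters is written.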







Action<>

Action is another general-purpose generic delegate. With “Action<>” we can point to any method which returns nothing, that is, a method whose return type is void and whose input parameters match the delegate's type arguments. For example, I can point an “Action<>” delegate to the method below,

public static void PrintSum(int a, int b)
        {
            Console.WriteLine(a + b);
        }
Whole program looks like,

using System;
namespace ConsoleApplication5
{
    class Program
    {
        static void Main(string[] args)
        {
            Action<int, int> sum = PrintSum;
            sum (100, 150);
            Console.ReadKey();
        }

        public static void PrintSum(int a, int b)
        {
            Console.WriteLine(a + b);
        }
    }
}

In the example, Action<int, int> means that with this delegate I can point to any method whose return type is void and which has two input parameters of type int.

The output would be “250”.

Now I am going to write the same with a lambda expression.


Action<int, int> sum = (a, b) => Console.WriteLine(a + b);

Here “a” and “b” are input parameters of type “int”, just like in the method we wrote before. The right side of the lambda is the action to execute, just like the Console.WriteLine call we saw in the “PrintSum” method. Here the right side of the lambda is an action rather than a returned value, because an Action<> delegate does not return anything. The whole program looks like,

using System;

namespace ConsoleApplication5
{
    class Program
    {
        static void Main(string[] args)
        {
            Action<int, int> sum = (a, b) => Console.WriteLine(a + b);
            sum(100, 150);
            Console.ReadKey();
        }
    }
}
Now also the output would be “250”.
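Action<T> also appears in the framework itself: List<T>.ForEach takes an Action<T> and runs it once per element. The sketch below illustrates that; the Visit helper and its "item n" strings are hypothetical and exist only so the effect of the action can be observed.

```csharp
using System;
using System.Collections.Generic;

public class Program
{
    // Hypothetical helper: collects what the action sees so the effect is observable.
    public static List<string> Visit(List<int> numbers)
    {
        var seen = new List<string>();
        Action<int> record = n => seen.Add("item " + n); // returns nothing, so it fits Action<int>
        numbers.ForEach(record);                         // List<T>.ForEach takes an Action<T>
        return seen;
    }

    public static void Main(string[] args)
    {
        foreach (string line in Visit(new List<int> { 1, 2, 3 }))
        {
            Console.WriteLine(line);
        }
    }
}
```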



Predicate<>

Predicate is another general-purpose generic delegate. With Predicate<T> we can point to any method which has one input parameter and returns bool. So I can point a predicate to the method,

public static bool PointByPredicate(int num)
        {
            return num == 100;
        }
The whole program looks like,

using System;

namespace PredicateExample
{
    class Program
    {
        static void Main(string[] args)
        {
            Predicate<int> myPredicate = PointByPredicate;
            Console.WriteLine(myPredicate(100));
            Console.ReadKey();
        }

        public static bool PointByPredicate(int num)
        {
            return num == 100;
        }
    }
}

The output would be “True”.

Now I am writing the same as a lambda,

Predicate<int> myPredicate = num => num == 100;

Here “num” is the input parameter of the lambda, just like the input parameter of the “PointByPredicate” method above.
The right side of the lambda (num == 100) checks the value of num. If num is one hundred it returns “True”, otherwise it returns “False”. This is the same as return num == 100; in the “PointByPredicate” method above. The whole program looks like,


using System;

namespace PredicateExample
{
    class Program
    {
        static void Main(string[] args)
        {
            Predicate<int> myPredicate = num => num == 100;
            Console.WriteLine(myPredicate(100));
            Console.ReadKey();
        }
    }
}

Yes, you are right: the output is “True” again.
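Predicate<T> is also used by the framework's own collection methods; for example, List<T>.FindAll takes a Predicate<T>. A minimal sketch follows, in which the OnlyHundreds helper is a hypothetical name used for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Program
{
    // Hypothetical helper: returns only the elements for which the predicate is true.
    public static List<int> OnlyHundreds(List<int> numbers)
    {
        Predicate<int> isHundred = num => num == 100;
        return numbers.FindAll(isHundred); // List<T>.FindAll takes a Predicate<T>
    }

    public static void Main(string[] args)
    {
        var numbers = new List<int> { 50, 100, 150, 100 };
        Console.WriteLine(OnlyHundreds(numbers).Count);
    }
}
```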

LINQ Operators 
LINQ operators are extension methods that extend IEnumerable<T>. We get the LINQ operators on our collections because the generic collection classes implement the IEnumerable<T> interface. We can group the LINQ operators into these categories,

Restriction Operators
Projection Operators
Partitioning Operators 
Join Operators 
Concatenation Operators
Ordering Operators 
Grouping Operators 
Set Operators
Conversion Operators
Equality Operators
Element Operators
Generation Operators
Quantifiers
We will discuss them one by one.

Restriction Operators

Where

The Where operator filters a sequence based on a predicate (a bool condition). Here a sequence is a collection, and an element is one item in that sequence. Where has two overloads. The one used in this post has the signature,

public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)

See the parameters of "Where" in the signature above,

"this IEnumerable<TSource> source" means that "Where" is an extension method which extends the interface IEnumerable<T>. So we get "Where" on any type which implements the IEnumerable<T> interface. The caller does not pass "source" as an argument; it is the sequence on which "Where" is called.

"Func<TSource, bool> predicate" means this is a method parameter of delegate type. The delegate expects a method whose input parameter is of type TSource (the type of one element of the sequence, not the whole collection) and which returns a bool value. For a sequence of strings, that means a method like this,

public static bool FuncMethod(string name)
        {
            return true;
        }
or we can write a lambda for the same as,
Func<string, bool> fun = a => true;

We will go ahead with lambda expressions.
I will explain "Where" with an example. Here I am writing a collection of names,

List<string> names = new List<string> { "Sreelas", "Serosh", "Mohandas", "Shareef", "Ansar" };

            var filteredNames = names.Where(a => a == "Serosh");

            foreach (var filteredName in filteredNames)
            {
                Console.WriteLine(filteredName);
            }

See the lambda a => a == "Serosh". It fits the "Func" parameter because it has one input parameter (a string) and returns a bool value, which matches the Func<TSource, bool> signature in the "Where" operator.

We can avoid the "foreach" loop with,


names.Where(a => a == "Serosh").ToList().ForEach(b => Console.WriteLine(b));
We will discuss this form in depth later.
So "Where" returns the elements that satisfy the condition, and our output would be "Serosh".


Next I will explain with another example. Say I have an Associate class with the properties AssociateId, FirstName, LastName and Location (the full class is listed with the Select example later in this post), and a collection of Associate entities named "associates".

[Image: Associate class and the "associates" collection]

Now I want the associate whose AssociateId is 101, and I want to print that result to the console. We can write the filtering and the printing in a single line by chaining lambda expressions.

[Image: single-line query, its longer equivalent, and the console output]

Let us check the "Where" operator in this example.

The "Func" in the "Where" operator of this example expects an input parameter of type "Associate" and returns "bool". That is why we implemented the lambda as
a => a.AssociateId == 101, where "a" is the input parameter of type "Associate" and "a.AssociateId == 101" returns a bool.

[Image: the disassembled "Where" method from the System.Linq namespace]

In the above lambda the variable "a" is of type "Associate"; that is why we get the properties of the Associate class on "a". The compiler works out the type of "a" by looking at the type of the collection "associates". Since "associates" is a collection of "Associate" objects, the compiler knows that "a" is an Associate.
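Since the screenshots for this example are missing, here is a sketch of what the single-line filter and print might look like. It reuses the Associate class and the sample data from the Select program later in this post; the SampleAssociates helper and the choice to print first and last name are assumptions for illustration, not the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Associate
{
    public int AssociateId { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Location { get; set; }
}

public class Program
{
    // Hypothetical helper that builds the sample collection.
    public static List<Associate> SampleAssociates()
    {
        return new List<Associate>
        {
            new Associate { AssociateId = 100, FirstName = "Sreelas", LastName = "Sreekumar", Location = "Cochin" },
            new Associate { AssociateId = 101, FirstName = "Serosh", LastName = "Vikraman", Location = "Trivandrum" },
            new Associate { AssociateId = 102, FirstName = "Mohandas", LastName = "PK", Location = "Banglore" }
        };
    }

    public static void Main(string[] args)
    {
        List<Associate> associates = SampleAssociates();

        // Filter and print in one chained statement.
        associates.Where(a => a.AssociateId == 101)
                  .ToList()
                  .ForEach(a => Console.WriteLine(a.FirstName + " " + a.LastName));
    }
}
```

With this sample data only one associate has AssociateId 101, so a single line is printed.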
Projection Operators: Select and SelectMany

Select Operator
"Select" projects each element of a sequence (collection) into a new form, for example one property of each object. "Select" returns an IEnumerable<TResult>. “Select” has two overloads. The one I use here has the signature,

public static IEnumerable<TResult> Select<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)

Let us look at this overload.

Here “this IEnumerable<TSource> source” means that “Select” is an extension method that extends the IEnumerable<T> type.

“Func<TSource, TResult> selector” is the method parameter of “Select”. For a collection of Associate objects TSource is Associate, so the Func becomes Func<Associate, TResult>: it can point to any method which has one input parameter of type Associate and returns TResult (TResult can be any type). If this is difficult to follow, please read the sections on delegates and Func above.
So I can point it to the method,

public static string SelectData(Associate associate)
        {
            return associate.FirstName;
        }

So when the Select method asks for a Func, I can pass this method, as it matches the signature of the Func in Select. Now I can write “Select” as

IEnumerable<string> selectAssociate = associates.Select(SelectData);
            foreach (var associate in selectAssociate)
            {
                Console.WriteLine(associate);
            }
            Console.ReadKey();

The whole program looks like,

using System;
using System.Collections.Generic;
using System.Linq;

namespace funcexamples
{
    class MyFuncClass
    {
        static void Main(string[] args)
        {
            List<Associate> associates = new List<Associate>
                                             { new Associate {AssociateId = 100,FirstName="Sreelas",LastName="Sreekumar",Location="Cochin"},
                                                 new Associate{AssociateId=101,FirstName="Serosh",LastName="Vikraman",Location="Trivandrum"},
                                                 new Associate{AssociateId=102,FirstName="Mohandas",LastName="PK",Location="Banglore"}
                                           };

            IEnumerable<string> selectAssociate = associates.Select(SelectData);
            foreach (var associate in selectAssociate)
            {
                Console.WriteLine(associate);
            }
            Console.ReadKey();
        }

        public static string SelectData(Associate associate)
        {
            return associate.FirstName;
        }
    }

    class Associate
    {
        public int AssociateId { get; set; }

        public string FirstName { get; set; }

        public string LastName { get; set; }

        public string Location { get; set; }
    }
}

Next I will explain the same implementation using a lambda expression.

The lambda expression for the above is associate => associate.FirstName, passed to “Select” in place of SelectData.

The result is stored in an IEnumerable<string> because we are returning only FirstName from the collection, and its type is string. If you want to return more than one value from each Associate with “Select”, project into a new object that carries those values, for example an anonymous type.

So the Select operator simply projects data from a collection.
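The screenshots for the lambda version are missing, so the following is a sketch of how both projections could be written. The SampleAssociates and FirstNames helpers and the FullName property are hypothetical names; the anonymous type is one possible way to return several values, assumed here because the SelectMany example in this post uses the same technique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Associate
{
    public int AssociateId { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Program
{
    // Hypothetical helper that builds the sample collection.
    public static List<Associate> SampleAssociates()
    {
        return new List<Associate>
        {
            new Associate { AssociateId = 100, FirstName = "Sreelas", LastName = "Sreekumar" },
            new Associate { AssociateId = 101, FirstName = "Serosh", LastName = "Vikraman" },
            new Associate { AssociateId = 102, FirstName = "Mohandas", LastName = "PK" }
        };
    }

    // One value per element: the result is a sequence of strings.
    public static IEnumerable<string> FirstNames(IEnumerable<Associate> associates)
    {
        return associates.Select(a => a.FirstName);
    }

    public static void Main(string[] args)
    {
        List<Associate> associates = SampleAssociates();

        foreach (string firstName in FirstNames(associates))
        {
            Console.WriteLine(firstName);
        }

        // Several values per element: project into an anonymous type.
        var idsAndNames = associates.Select(a => new { a.AssociateId, FullName = a.FirstName + " " + a.LastName });
        foreach (var item in idsAndNames)
        {
            Console.WriteLine(item.AssociateId + ": " + item.FullName);
        }
    }
}
```

The anonymous-type result has to be stored in a "var", because the type has no name that could be written in the declaration.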


SelectMany Operator


SelectMany is another projection operator in LINQ. It maps each element of a sequence to a collection and flattens the resulting collections into one sequence, so it can be used to combine one collection with another and fetch data from both. I will explain with an example. Consider an “Associate” collection and a “Project” collection, where I want to get data from both.

[Image: full program listing]

SelectMany has four overloads.

[Image: SelectMany overload list]

The query below uses the overload that takes a collection selector and a result selector. I will explain that lambda,



            var joinedResult = associates.Where(a => a.ProjectId == 1).SelectMany(associate => projects,
                                      (associate, project) =>
                                      new
                                          {
                                              associate.AssociateId,
                                              project.ProjectId,
                                              associate.FirstName,
                                              associate.LastName,
                                              project.ProjectName
                                          }).Where(a => a.ProjectId == 1);




Leave the "Where" calls in the query aside, as I already discussed "Where" earlier. Let us look at "SelectMany". The first parameter of this "SelectMany" overload is "Func<TSource, IEnumerable<TCollection>>", which means this Func accepts one parameter of type “TSource” (the element type of the source sequence) and returns an IEnumerable<TCollection>, that is, a collection of some type. So we implemented it as,
associate => projects

Here “associate” is of type “Associate” and “projects” is a collection of “Project” objects declared outside the query. “associate” is the input and “projects” is the returned collection (the second Func gets its input types from these two).
Now the second parameter,
Func<TSource, TCollection, TResult> resultSelector, which means this Func has two input parameters (TSource, TCollection) and returns TResult, so I implemented it as,

(associate, project) =>   new
                                          {
                                              associate.AssociateId,
                                              project.ProjectId,
                                              associate.FirstName,
                                              associate.LastName,
                                              project.ProjectName
                                          }
The compiler expects the first parameter of this Func to be of type “Associate” and the second to be a single element of the “Project” collection, so here associate is of type “Associate” and project is of type “Project” (TCollection is the element type of the collection). The right side of the lambda returns the combined data as an anonymous type.
That is the meaning of this overload. If you understand it, you can work out the remaining overloads.
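Because the full program for this example was in a screenshot, here is a self-contained sketch of the same SelectMany overload. The shapes of the Associate and Project classes, the sample data and the Describe helper are assumptions for illustration; the sketch also matches associates to projects by ProjectId, which is a slightly different filter from the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shapes: the classes for this example are not shown in the post.
public class Associate
{
    public int AssociateId { get; set; }
    public string FirstName { get; set; }
    public int ProjectId { get; set; }
}

public class Project
{
    public int ProjectId { get; set; }
    public string ProjectName { get; set; }
}

public class Program
{
    // Hypothetical helper: pairs associates with projects and describes the matching pairs.
    public static List<string> Describe(List<Associate> associates, List<Project> projects)
    {
        return associates
            .SelectMany(associate => projects,                      // collection selector: every project for every associate
                        (associate, project) => new { associate, project }) // result selector: one pair per combination
            .Where(pair => pair.associate.ProjectId == pair.project.ProjectId) // keep matching pairs only
            .Select(pair => pair.associate.FirstName + " works on " + pair.project.ProjectName)
            .ToList();
    }

    public static void Main(string[] args)
    {
        var associates = new List<Associate>
        {
            new Associate { AssociateId = 100, FirstName = "Sreelas", ProjectId = 1 },
            new Associate { AssociateId = 101, FirstName = "Serosh", ProjectId = 2 }
        };
        var projects = new List<Project>
        {
            new Project { ProjectId = 1, ProjectName = "Billing" },
            new Project { ProjectId = 2, ProjectName = "Payroll" }
        };

        foreach (string line in Describe(associates, projects))
        {
            Console.WriteLine(line);
        }
    }
}
```

Used this way, SelectMany first produces every associate and project combination, and the following "Where" narrows that down, which is why it behaves like a join.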

Partitioning Operator
Partitioning operators are,
Take
Skip
TakeWhile
SkipWhile.
Let’s begin with “Take”.

Take

The Take operator takes the given number of elements from the start of a sequence (collection). I will explain with an example,
using System;
using System.Linq;

namespace ProjectionOperator
{
    class Program
    {
        static void Main(string[] args)
        {
            int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

            var num = numbers.Take(5);
            foreach (var number in num)
            {
                Console.WriteLine(number);
            }
            Console.ReadKey();
        }
    }
}

In the above example we have an integer array with the numbers 1 to 10. Here the “Take” operator takes the first five elements and ignores the rest, so the output is the numbers 1 to 5. Simple.
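For comparison, here is a brief sketch of the other three partitioning operators listed above, run against the same array. The Numbers field name and the n < 4 condition are chosen only for illustration.

```csharp
using System;
using System.Linq;

public class Program
{
    public static readonly int[] Numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    public static void Main(string[] args)
    {
        // Skip: ignore the first five elements and return the rest (6 to 10).
        Console.WriteLine(string.Join(", ", Numbers.Skip(5)));

        // TakeWhile: take elements from the start while the condition holds (1 to 3).
        Console.WriteLine(string.Join(", ", Numbers.TakeWhile(n => n < 4)));

        // SkipWhile: skip elements from the start while the condition holds (4 to 10).
        Console.WriteLine(string.Join(", ", Numbers.SkipWhile(n => n < 4)));
    }
}
```

TakeWhile and SkipWhile stop testing at the first element that fails the condition, so they are not the same as filtering the whole sequence with "Where".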







