.Net Programming - Financial independence

Using Join With LINQ in C#

In this post we’ll have a look at how to use the join keyword with LINQ (Language INtegrated Query) in C#. You’ll see that it’s not more complicated than using it in SQL at all and the difference is here you can leverage its power on collections of data in memory in addition to XML, SQL, JSon and so on!

What’s a join exactly? Well it combines two or more sets of data into one in a specified way. Sounds simple? It is but there are multiple ways to get to the same result and that’s what makes it a bit more complex. One advice I can give before starting is to always try to keep it as simple and easy to read as you can.

In the case of the join in LINQ, it is on it’s own, a inner join. In other words it will grab what is common to the joined collections and create another collection with that data. In this post, we’ll have a look at the inner join and the left (outer) join.

1. Setup!

Using joins in LINQ won’t be more complicated than an SQL query that’s for sure and will do an awesome job! If you`ve never used SQL before, don’t worry.

First, create a new .Net Framework console application in Visual Studio. I strongly recommend .Net Framework 4.8+ or you can use .Net Code 3.0+ as well. But in this examples I’m going to use .Net Framework 4.8.

We’ll create two collections to merge together using the join keyword. These collections will contain objects, for that, we’ll need a new classes. Create a new class named Car and one named Buyer in your project just like below. We will use the Car type for the inner join part and add the Buyer type in the left outer join part.

Car.cs

namespace TutorialLinqCSharp
{
    public class Car
    {
        public string VIN { get; set; }
        public string Brand { get; set; }
        public string Model { get; set; }

        public override string ToString()
        {
            return $"Vehicle identification number: {VIN}, Brand: {Brand}, Model: {Model}";
        }
    }
}

Buyer.cs

namespace TutorialLinqCSharp
{
    public class Buyer
    {
        public string Name { get; set; }
        public int Age { get; set; }
        public string CarVIN { get; set; }

        public override string ToString()
        {
            return $"Name: {Name}, Age: {Age}, Car VIN: {CarVIN}";
        }
    }
}

Program.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace TutorialLinqCSharp
{
    class Program
    {
        static void Main(string[] args)
        {

        }
    }
}

Note: the name of my project is TutorialLinqCSharp, yours might be different so you’ll want to make sure the namespace is correct. In other words, replace TutorialLinqCSharp with the namespace you’ve chosen/the name of your project.

2. The Inner Join

What the Inner Join Looks Like

Before we begin this part, here is a visual representation of the inner join. As you can see, we’re going to go fetch what is similar on the two collections. What LINQ considers similar will be determined by the ON statement.

Inner Join

Syntax

Now we’re ready to add the two collections to our Program.cs file. We will have two List collections with 3 and 4 cars respectively which should be enough for our demonstration. Have a great look at the cars and how they’re different or similar as this is going to have an impact when merging both lists.

Program.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace TutorialLinqCSharp
{
    class Program
    {
        static void Main(string[] args)
        {
            // We use Lists for the convenience because we can add/remove items easily.
            List<Car> dealershipInventoryOne = new List<Car>();
            List<Car> dealershipInventoryTwo = new List<Car>();

            // Fill up the inventory of Dealership One.
            dealershipInventoryOne.Add(new Car() { VIN = "JEQODSM21893211", Brand = "Toyota", Model = "86" });
            dealershipInventoryOne.Add(new Car() { VIN = "JGFFFEE32132134", Brand = "Subaru", Model = "BRZ" });
            dealershipInventoryOne.Add(new Car() { VIN = "JDSAJIS54355432", Brand = "Scion", Model = "FRS" });

            // Fill up the inventory of Dealership Two.
            dealershipInventoryTwo.Add(new Car() { VIN = "JQWKOSU32132132", Brand = "Subaru", Model = "BRZ" });
            dealershipInventoryTwo.Add(new Car() { VIN = "JDSOUWE54354444", Brand = "Toyota", Model = "GT86" });
            dealershipInventoryTwo.Add(new Car() { VIN = "JJSOOQU12354567", Brand = "Toyota", Model = "Supra" });
            dealershipInventoryTwo.Add(new Car() { VIN = "JFFDFDS99699488", Brand = "Toyota", Model = "FRS" });
        }
    }
}

The next step will be to add the LINQ query with the join clause in it. Notice that this is not really a one size fits all process. You’ll need to assess the situation and come up with a solution on your own but, knowing and seeing the tools in actions on different scenarios will help you in your thought process.

// We join both lists using the join keyword. The on will specify which Property of the object "Car"
// will be used to identify similarity. In this case, we will take the object(s) that have common Model
// between the two collections.
IEnumerable<Car> sameModelCars = from Car inventoryItemOne in dealershipInventoryOne
                                 join Car inventoryItemTwo in dealershipInventoryTwo
                                 on inventoryItemOne.Model
                                 equals inventoryItemTwo.Model
                                 select inventoryItemOne;

As I explain in the comments I am joining both collections on the model which means that the resulting collection (of IEnumerable type) will contain the item(s) from the joined list that have the same model. Which is exactly what the inner join does: take similar items from both collections according to a specific property/properties.

Here is the code of Program.cs for the first part of the example. You can try to run it and see the result. Also you should put breakpoints in the code to follow the processing step by step.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace TutorialLinqCSharp
{
    class Program
    {
        static void Main(string[] args)
        {
            // We use Lists for the convenience because we can add/remove items easily.
            List<Car> dealershipInventoryOne = new List<Car>();
            List<Car> dealershipInventoryTwo = new List<Car>();

            // Dealership One inventory.
            dealershipInventoryOne.Add(new Car() { VIN = "JEQODSM21893211", Brand = "Toyota", Model = "86" });
            dealershipInventoryOne.Add(new Car() { VIN = "JGFFFEE32132134", Brand = "Subaru", Model = "BRZ" });
            dealershipInventoryOne.Add(new Car() { VIN = "JDSAJIS54355432", Brand = "Scion", Model = "FRS" });

            // Dealership Two inventory.
            dealershipInventoryTwo.Add(new Car() { VIN = "JQWKOSU32132132", Brand = "Subaru", Model = "BRZ" });
            dealershipInventoryTwo.Add(new Car() { VIN = "JDSOUWE54354444", Brand = "Toyota", Model = "GT86" });
            dealershipInventoryTwo.Add(new Car() { VIN = "JJSOOQU12354567", Brand = "Toyota", Model = "Supra" });
            dealershipInventoryTwo.Add(new Car() { VIN = "JFFDFDS99699488", Brand = "Toyota", Model = "FRS" });

            // We join both arrays using the join keyword. The on will specify which Property of the object "Car"
            // will be used to identify similarity. In this case, we will take the object(s) that have common Model
            // between the two collections.
            IEnumerable<Car> sameModelCars = from Car inventoryItemOne in dealershipInventoryOne
                                             join Car inventoryItemTwo in dealershipInventoryTwo
                                             on inventoryItemOne.Model
                                             equals inventoryItemTwo.Model
                                             select inventoryItemOne;

            // Output the collection in the console!
            OutputCollectionToConsole(sameModelCars);

            // Stop the execution of the console.
            Console.ReadKey();
        }

        /// <summary>
        /// This method outputs in the console any IEnumerable collection that contains object by calling the ToString() method 
        /// of each object in the collection.
        /// </summary>
        /// 
        /// <param name="collectionToOutput">The collection that will be output in the console.</param>
        private static void OutputCollectionToConsole(IEnumerable<object> collectionToOutput)
        {
            foreach (object collectionItem in collectionToOutput)
                Console.WriteLine(collectionItem.ToString());
        }
    }
}

Let’s Decompose the LINQ Query

I know I could have started with a simpler example, but I wanted to show something I consider more realistic than joining two lists of int. That said, if we go line by line on the LINQ query we used in the last code block:

IEnumerable<Car> sameModelCars = from Car inventoryItemOne in dealershipInventoryOne
                                 join Car inventoryItemTwo in dealershipInventoryTwo
                                 on inventoryItemOne.Model
                                 equals inventoryItemTwo.Model
                                 select inventoryItemOne;
  1. The from Car inventoryItemOne in dealershipInventoryOne will tell LINQ that we want one Car type variable from the dealershipInventoryOne List object.
  2. Next we join the second list (dealershipInventoryTwo) and indicates that the items in there will be placed in the inventoryItemTwo variable and are of type Car.
  3. We need to specify on which item the join will occur because LINQ needs to know which element(s) of the Car object we are looking to match. Here we want the Model to match. If we look at our two collections and remember what the inner join is about which is to find common elements, we can guest that the two common models are BRZ and FRS.
  4. Finally the select will specify what we will put in the IEnumerable sameModelCars variable. Here it will be the whole Car object but we could have selected a specific property like inventoryItemOne.Make. Then we would have needed to change the type of the IEnumerable to match the retrieved elements’ type.

Note: with LINQ, you have to use the equals keyword to specify the equality.

Below is the output of the query.

Inner join equals, output in console.

Notice that the VIN from the dealershipInventoryOne is displayed, that’s because we’ve selected the inventoryItemOne in the query, let’s select inventoryItemTwo instead and see the result. You can replace the query above by this one. The only difference is the selected item.

IEnumerable<Car> sameModelCars = from Car inventoryItemOne in dealershipInventoryOne
                                 join Car inventoryItemTwo in dealershipInventoryTwo
                                 on inventoryItemOne.Model equals inventoryItemTwo.Model
                                 select inventoryItemTwo;

Now let’s see what it gives us!

Output from the Console with InventoryItemTwo selected.

As you certainly noticed, you have the choice of which to select. I won’t proceed further on the select as it’s not the goal of this post but we will get to it in a future post. I want to show you something else before we switch to the Left outer join.

Inner Join on Multiple Values

Let’s say you want to join on more than just the Model. In fact I would also want to join on the Brand. We want the items in our newly joined collection to have the same brand and the same model as it’s an inner join. Then we can modify the on section of the LINQ query because it’s the part that checks equality between both collections.

IEnumerable<Car> sameModelCars = from Car inventoryItemOne in dealershipInventoryOne
                                 join Car inventoryItemTwo in dealershipInventoryTwo
                                 on new { inventoryItemOne.Model, inventoryItemOne.Brand }
                                    equals new { inventoryItemTwo.Model, inventoryItemTwo.Brand }
                                 select inventoryItemTwo;

Now it’s even more precise as only the Subaru BRZ appears! That’s because we had the Toyota FRS and Scion FRS which were the same model but not the same brand. For the equality to work it needs both the brand and the model being the same with the new query and only the ones matching from both collections will be in the result.

Output from the Console.

3. The Left Outer Join

What the Left Outer Join Looks Like

Left Join

As you can see, we are taking out of the two collections the same thing as the inner join plus what is in the first collection even if it doesn’t have any similarities (what’s specified in the on statement) with what’s the second collection. You’ll understand better with the example that follows.

Syntax

The Left Outer Join or Left Join is a bit more tricky than the inner join but can be useful in specific situations. What is to remember is that it’s going to take all the values in the first collection and the common values from the first and the second collection as I said above. If the second collection has elements (or properties) that are not in the first collection, they’re going to be replaced by null just like we would do with SQL.

For doing the left outer join, we’ll add buyers from which we’ve collected the VIN of their respective car. I created a new list like below.

List<Buyer> carBuyers = new List<Buyer>();
carBuyers.Add(new Buyer() { Name = "Bob", Age = 42, CarVIN = "AASDSAD12222213" });
carBuyers.Add(new Buyer() { Name = "Roger", Age = 27, CarVIN = "DSADSAD12222213" });
carBuyers.Add(new Buyer() { Name = "Alex", Age = 30, CarVIN = "JGFFFEE32132134" });
carBuyers.Add(new Buyer() { Name = "Sofia", Age = 36, CarVIN = "JDSAJIS54355432" });

Let’s say we start by creating our query just like we’ve done with the inner join and see what happens. I want to join buyers and cars on the VIN. Also, as you might have noticed already, some VIN specified in our new list don’t exist in the dealershipInventoryOne collection, look at the following query’s result.

Console.WriteLine("----Inner Join With Buyers----");

// Add buyers that own cars, we want to know if any of the buyers own a car on the 86 platform from dealershipInventoryOne.
List<Buyer> carBuyers = new List<Buyer>();
carBuyers.Add(new Buyer() { Name = "Bob", Age = 42, CarVIN = "AASDSAD12222213" });
carBuyers.Add(new Buyer() { Name = "Roger", Age = 27, CarVIN = "DSADSAD12222213" });
carBuyers.Add(new Buyer() { Name = "Alex", Age = 30, CarVIN = "JGFFFEE32132134" });
carBuyers.Add(new Buyer() { Name = "Sofia", Age = 36, CarVIN = "JDSAJIS54355432" });

var buyersWithCarInner = from Buyer buyer in carBuyers
                    join Car inventoryItemOn in dealershipInventoryOne
                    on buyer.CarVIN equals inventoryItemOn.VIN
                    select new
                    {
                        Buyer = buyer,
                        Car = inventoryItemOn
                    };

// Display the buyer and his/her car in the console.
foreach(var buyerWithCarInner in buyersWithCarInner)
{
    Console.WriteLine($"Buyer: {buyerWithCarInner.Buyer.ToString()}, Car: {buyerWithCarInner.Car.ToString()}");
}

Don’t let the select new discourage you, I am simply creating a new object on the fly that will contain the buyer object as property Buyer and the inventoryItemOne (Car) that correspond to the VIN the buyer has. In other words, an object containing the buyer and his/her car.

Here is the output in the console from adding the following two pieces of code to our Program.cs

Left join without grouping.

Well, that’s an inner join! It only returned the buyers that had a corresponding car in the inventoryOne! That’s not what we want because a left outer join should return all the elements in the first set and what’s common to both sets. To do that we’ll need to modify our query a little.

Now I Want a Real Left Outer Join!

Add the following code to your Program.cs. We will add grouping and this will give us null car values for the buyers that do not own a car in the dealershipInventoryOne list.

Console.WriteLine("----Left Join With Buyers----");

// Left outer join applied in the LINQ query.
var buyersWithCar = from Buyer buyer in carBuyers
                    join Car inventoryItemOn in dealershipInventoryOne
                    on buyer.CarVIN equals inventoryItemOn.VIN
                    into carGroup
                    from groupedCar in carGroup.DefaultIfEmpty(new Car()
                    {
                        VIN = buyer.CarVIN,
                        Brand = "Unknown",
                        Model = "Unknown"
                    })
                    select new
                    {
                        Buyer = buyer,
                        Car = groupedCar
                    };

// The car can be null since we don't know if the buyer has an 86 platform car from dealershipInventoryOne.
foreach (var buyerWithCar in buyersWithCar)
    Console.WriteLine($"Buyer: {buyerWithCar.Buyer.ToString()}, Car: {buyerWithCar.Car.ToString()}");

I want to draw your attention to the into carGroup. Here we have only one car possible because the buyer has one car associated to it, but it could be different if the buyer could own multiple cars. In our case, it’s only one or zero corresponding car in the dealershipInventoryOne.

Let’s Decompose the LINQ Query

  1. We start by getting the Buyer object from the carBuyers collection.
  2. We are then going to join the dealershipInventoryOne on the VIN of the buyer and the inventoryOne cars.
  3. So for every Car object joined to the Buyer object on the VIN, LINQ is going to create a group of those cars (in our case it’s zero or one that it will find because the buyer only has one VIN associated). The group is called carGroup.
  4. Then for every car in this group, LINQ will create an anonymous object (on the fly) containing the buyer and the corresponding car. If there is no car found for this buyer, LINQ will use the default car object we’ve created in the DefaultIfEmpty method.
  5. The resulting object is what is going in the buyersWithCar collection. Notice the use of Var as the type of the returned collection. That’s because we’re working with anonymous objects created on the fly so there is no defined type like Car or Buyer classes.

I want to recommend that you play with the code a little as I don’t consider this an easy concept to grasp.

Finally, here is the output in the console.

Left Outer Join Console Output.

Complete Program.cs Code

using System;
using System.Collections.Generic;
using System.Linq;

namespace TutorialLinqCSharp
{
    class Program
    {
        static void Main(string[] args)
        {
            // We use Lists for the convenience because we can add/remove items easily.
            List<Car> dealershipInventoryOne = new List<Car>();
            List<Car> dealershipInventoryTwo = new List<Car>();

            // Dealership One inventory.
            dealershipInventoryOne.Add(new Car() { VIN = "JEQODSM21893211", Brand = "Toyota", Model = "86" });
            dealershipInventoryOne.Add(new Car() { VIN = "JGFFFEE32132134", Brand = "Subaru", Model = "BRZ" });
            dealershipInventoryOne.Add(new Car() { VIN = "JDSAJIS54355432", Brand = "Scion", Model = "FRS" });

            // Dealership Two inventory.
            dealershipInventoryTwo.Add(new Car() { VIN = "JQWKOSU32132132", Brand = "Subaru", Model = "BRZ" });
            dealershipInventoryTwo.Add(new Car() { VIN = "JDSOUWE54354444", Brand = "Toyota", Model = "GT86" });
            dealershipInventoryTwo.Add(new Car() { VIN = "JJSOOQU12354567", Brand = "Toyota", Model = "Supra" });
            dealershipInventoryTwo.Add(new Car() { VIN = "JFFDFDS99699488", Brand = "Toyota", Model = "FRS" });

            Console.WriteLine("----Inner Join----");

            // We join both arrays using the join keyword. The on will specify which Property of the object "Car"
            // will be used to identify similarity. In this case, we will take the object(s) that have common Model
            // between the two collections.
            IEnumerable<Car> sameModelCars = from Car inventoryItemOne in dealershipInventoryOne
                                             join Car inventoryItemTwo in dealershipInventoryTwo
                                             on new { inventoryItemOne.Model }
                                                equals new { inventoryItemTwo.Model }
                                             select inventoryItemTwo;

            // Output the collection in the console!
            OutputCollectionToConsole(sameModelCars);

            Console.WriteLine("----Inner Join With Buyers----");

            // Add buyers that own cars, we want to know if any of the buyers own a car on the 86 platform from dealershipInventoryOne.
            List<Buyer> carBuyers = new List<Buyer>();
            carBuyers.Add(new Buyer() { Name = "Bob", Age = 42, CarVIN = "AASDSAD12222213" });
            carBuyers.Add(new Buyer() { Name = "Roger", Age = 27, CarVIN = "DSADSAD12222213" });
            carBuyers.Add(new Buyer() { Name = "Alex", Age = 30, CarVIN = "JGFFFEE32132134" });
            carBuyers.Add(new Buyer() { Name = "Sofia", Age = 36, CarVIN = "JDSAJIS54355432" });

            var buyersWithCarInner = from Buyer buyer in carBuyers
                                     join Car inventoryItemOn in dealershipInventoryOne
                                     on buyer.CarVIN equals inventoryItemOn.VIN
                                     select new
                                     {
                                         Buyer = buyer,
                                         Car = inventoryItemOn
                                     };

            foreach (var buyerWithCarInner in buyersWithCarInner)
                Console.WriteLine($"Buyer: {buyerWithCarInner.Buyer.ToString()}, Car: {buyerWithCarInner.Car.ToString()}");

            Console.WriteLine("----Left Join With Buyers----");

            // Left outer join applied in the LINQ query.
            var buyersWithCar = from Buyer buyer in carBuyers
                                join Car inventoryItemOn in dealershipInventoryOne
                                on buyer.CarVIN equals inventoryItemOn.VIN
                                into carGroup
                                from groupedCar in carGroup.DefaultIfEmpty(new Car()
                                {
                                    VIN = buyer.CarVIN,
                                    Brand = "Unknown",
                                    Model = "Unknown"
                                })
                                select new
                                {
                                    Buyer = buyer,
                                    Car = groupedCar
                                };

            // The car can be null since we don't know if the buyer has an 86 platform car from dealershipInventoryOne.
            foreach (var buyerWithCar in buyersWithCar)
                Console.WriteLine($"Buyer: {buyerWithCar.Buyer.ToString()}, Car: {buyerWithCar.Car.ToString()}");

            // Stop the execution of the console.
            Console.ReadKey();
        }

        /// <summary>
        /// This method outputs in the console any IEnumerable collection that contains object by calling the ToString() method 
        /// of each object in the collection.
        /// </summary>
        /// 
        /// <param name="collectionToOutput">The collection that will be output in the console.</param>
        private static void OutputCollectionToConsole(IEnumerable<object> collectionToOutput)
        {
            foreach (object collectionItem in collectionToOutput)
                Console.WriteLine(collectionItem.ToString());
        }
    }
}

Why Am I Not Showing More Join Types?

In addition to my effort to keep tutorials simple and kind of short, you’ll see that by using more LINQ methods like Union and Except, you can do a lot of things that don’t require joins at all. I think that the inner join and the left join are the most useful and should be enought for 90 %+ of the cases using LINQ.

If you need more information on the Join statement with LINQ? You can refer to the Microsoft Docs on the subject. Same if you want more information on the groups in LINQ!

Happy coding everyone!

Found an error in the code? Please send me a message so I can fix it as soon as possible!

YouTube

Inner Join

Previous article How to Change the Interface Language in Visual Studio

Related posts

0 Comments

No Comments Yet!

You can be first to comment this post!

Leave a Comment

Your data will be safe! Your e-mail address will not be published. Also other data will not be shared with third person. Required fields marked as *