Simon Online

2016-11-09

C# Wildcards/Discards/Ignororators

There is some great discussion going on about including discard variables in C#, possibly even for the C# 7 timeframe. It is so new that the name for them is still up in the air. In Haskel it is called a wildcard. I think this is a great feature which is found in other languages but isn’t well known for people who haven’t done funcitonal programming. The C# language has been sneaking into being a bit more functional over the last few releases. There is support for lambdas and there has been a bunch of work on immutability. Let’s take a walk through how wildcards works.

Let’s say that we have a function which has a number of output paramaters:

1
void DoSomething(out List<T> list, out int size){}

Ugh, already I hate this method. I’ve never liked the out syntax because it is wordy. To use this function you would have to do

1
2
3
List<T> list = null;
int size = 0;
DoSomething(out list, out size);

There is some hope for that syntax in C# 7 with what I would have called inline declaration of out variables but is being called “out variables”. The syntax would look like

1
DoSomething(out List<T> list, out int size);

This is obviously much nicer and you can read a bit more about it at
https://blogs.msdn.microsoft.com/dotnet/2016/08/24/whats-new-in-csharp-7-0/

However in my code base perhaps I don’t care about the size parameter. As it stands right now you still need to declare some variable to hold the size even if it never gets used. For one variable this isn’t a huge pain. I’ve taken to using the underscore to denote that I don’t care about some variable.

1
2
DoSomething(out List<T> list, out int _);
//make use of list never reference the _ variable

The issue comes when I have some funciton which takes many parameters I don’t care about.

1
2
DoSomething(out List<T> list, out int _, out float __, out decimal ___);
//make use of list never reference the _ variables

This is a huge bit of uglyness because we can’t overload the _ variable so we need to create a bunch more variables. It is even more so ugly if we’re using tuples and a deconstructing declaration (also part of C# 7). Our funciton could be changed to look like

1
(List<T>, int, float, decimal) DoSomething() {}

This is now a function which returns a tuple containing everything we previously had as out prameters. Then you can break this tuple up using a deconstructing declaration.

1
(List<T> list, int size, float fidelity, decimal cost) = DoSomething();

This will break up the tuple into the fields you actually want. Except you don’t care about size, fidelity and cost. With a wildcard we can write this as

1
(List<T> list, int _, float _, decimal _) = DoSomething();

This beauty of this wildcard is that we can use the same wildcard for each field an not worry about them in the least.

I’m really hopeful that this feature will make it to the next release.

2016-07-20

Can't connect to windows docker daemon

I updated my Windows machine to the latest version on the fast ring to get some access to awesome Windows container goodness. I followed the instructions at Microsoft’s MSDN but I got stuck trying to connect to the docker daemon to import the image.

1
2
C:\Users\Simon> docker load -i nanoserver.tar.gz
An error occurred trying to connect: Post http://localhost:2375/v1.21/images/load: dial tcp 127.0.0.1:2375: ConnectEx tcp: No connection could be made because the target machine actively refused it.

Turns out the solution is to put a file in c:\programdata\docker\config\daemon.json and inside that file put

1
2
3
{
"hosts": ["tcp://0.0.0.0:2375"]
}

This will listen on any interface on port 2375. You might do better to put in

1
2
3
{
"hosts": ["tcp://127.0.0.1:2375"]
}

which will at least limit connections to your local machine. Now everything else in the tutorial works as it should.

2016-07-17

An Intro to NGINX for Kestrel

Kestrel is a light weight web server for hosting ASP.NET Core applications on really any platform. It is based on a library called libuv which is an eventing library and it, actually, the same one used by nodejs. This means that it is an event driven asynchronous I/O based server.

When I say that Kestrel is light weight I mean that it is lacking a lot of the things that an ASP.NET web developer might have come to expect from a web server like IIS. For instance you cannot do SSL termination with Kestrel or URL rewrites or GZip compression. Some of this can be done by ASP.NET proper but that tends to be less efficient than one might like. Ideally the server would just be responsbile for running ASP.NET code. The suggested approach not just for Kestrel but for other light weight front end web servers like nodejs is to put a web server in front of it to handle infrastructure concerns. One of the better known ones is Nginx (pronounced engine-X like racer X).

https://www.nginx.com/wp-content/themes/nginx-theme/assets/img//logo.png

Nginix is a basket full of interesting capabilities. You can use it as a reverse proxy; in this configuration it takes load off your actual web server by preserving a cache of data which it serves before calling back to your web server. As a proxy it can also sit in front of multiple end points on your server and make them appear to be a single end point. This is useful for hiding a number of microservices behind a single end point. It can do SSL termination which makes it easy to add SSL to your site without having to modify a single line of code. It can also do gzip compression and serve static files. The commercial version of Nginx adds load balancing to the equation and a host of other things.

Let’s set up Nginx in front of Kestrel to provide gzip support for our web site. First we’ll just create a new ASP.NET Core web application.

1
yo aspnet

Select Web Application and then bring it up with

1
2
dotnet restore
dotnet run

This is running on port 5000 on my machine and hitting it with a web browser reports the content-encoding as regular, no gzip.

No gzip content encoding

That’s no good, we want to make sure our applications are served with gzip. That will make the payload smaller and the application faster to load.

Let’s set up Nginx. I installed my copy through brew (I’m running on OSX) but you can just as easily download a copy from the Nginx site. There is even support for Windows although the performance there is not as good as it in on *NIX operating systems. I then set up a nginx.conf configuraiton file. The default config file is huge but I’ve trimmed it down here and annotated it.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
#number of worker processes to spawn
worker_processes 1;

#maximum number of connections
events {
worker_connections 1024;
}

#serving http information
http {
#set up mime types
include mime.types;
default_type application/octet-stream;

#set up logging
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /Users/stimms/Projects/nginxdemo/logs/access.log main;

#uses sendfile(2) to send files directly to a socket without buffering
sendfile on;

#the length of time a connection will stay alive on the server
keepalive_timeout 65;

#compress the response stream with gzip
gzip on;

#configure where to listen
server {
#listen over http on port 8080 on localhost
listen 8080;
server_name localhost;

#serve static files from /Users/stimms/Projects/nginxdemo for requests for
#resources under /static
location /static {
root /Users/stimms/Projects/nginxdemo;
}

#by default pass all requests on / over to localhost on port 5000
#this is our Kestrel server
location / {
proxy_pass http://127.0.0.1:5000/;
}

}
}

With this file in place we can load up the server on port 8080 and test it out.

nginx -c /Users/stimms/Projects/nginxdemo/nginx.conf

I found I had to use full paths to the config file or nginx would look in its configuration directory.

Don’t forget to also run Kestrel. Now when pointing a web browser at port 8080 on the local host we see

Content-encoding gzip enabled

Content-encoding now lists gzip compression. Even on this small page we can see a reduction from 8.5K to 2.6K scaled over a huge web site this would be a massive savings.

Let’s play with taking some more load off the Kestrel server by caching results. In the nginx configuration file we can add a new cache under the http configuration

1
2
3
#set up a proxy cache location
proxy_cache_path /tmp/cache levels=1:2 keys_zone=aspnetcache:8m max_size=1000m inactive=600m;
proxy_temp_path /tmp/cache/temp;

This sets up a cache in /tmp/cache of size 8MB up to 1000MB which will become inactive after 600 minutes (10 hours). Then under the listen directive we’ll add some rules about what to cache

1
2
3
4
#use the proxy to save files
proxy_cache aspnetcache;
proxy_cache_valid 200 302 60m;
proxy_cache_valid 404 1m;

Here we cache 200 and 302 respones for 60 minutes and 404 responses for 1 minute. If we add these rules and restart the nginx server

1
nginx -c /Users/stimms/Projects/nginxdemo/nginx.conf -s reload

Now when we visit the site multiple times the output of the Kestrel web server shows it isn’t being hit. Awesome! You might not want to cache everything on your site and you can add rules to the listen directive to just cache image files, for instance.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
#just cache image files, if not in cache ask Kestrel
location /images/ {
#use the proxy to save files
proxy_cache aspnetcache;
proxy_cache_valid 200 302 60m;
proxy_cache_valid 404 1m;
proxy_pass http://127.0.0.1:5000;
}

#by default pass all requests on / over to localhost on port 5000
#this is our Kestrel server
location / {
proxy_pass http://127.0.0.1:5000/;
}

While Kestrel is fast it is still slower than Nginx at serving static files so it is worthwhile offloading traffix to Nginx when possible.

Nginx is a great deal of fun and worth playing with. We’ll probably revisit it in future and talk about how to use it in conjunction with microservices. You can find the code for this post at https://github.com/AspNetMonsters/Nginx.

2016-06-11

How I fixed OneDrive like Mark Russinovich

Fellow Monster David Paquette sent me a link to a shared OneDrive folder today with some stuff in it. Clicking on the link I was able to add it to my OneDrive. The dialog told me files would appear on my machine soon. So I waited.

After an outrageously long time, 37 seconds, the files weren’t there and I went hunting to find out why. As it turns out OneDrive wasn’t even running. That’s suppose to be a near imposiblity in Windows 10 so I hopped on the Interwebernets to find out why. Multiple sources suggested solutions like clearing my credentials and running OneDrive.exe /reset. Of course none of them worked.

Something was busted.

Running the OneDrive executable didn’t bring up the UI it didn’t do any of the things the Internet told me it should. My mind went back to when I was setting up my account on this computer and how I fat fingered stimm instead of stimms as my user name. Could it be the OneDrive was trying to access some files that didn’t exist?

Channeling my inner Mark Russinovich I opened up ProcessMonitor a fantastic tool which monitors file system and registry access. You can grab your own copy for free from https://technet.microsoft.com/en-us/sysinternals/bb896645.aspx.

In the UI I added filters for any process with the word “drive” in it and then filtered out “google”. I did this because I wasn’t sure if the rename from skydrive to onedrive had missed anything. Then I ran the command line to start up OneDrive again.

Process monitor found about 300 results before the process exited. Sure enough as I went through the file accesses I found
http://i.imgur.com/soAh4PR.png
Sure enough OneDrive is trying to create files inside of a directory which doesn’t exist. Scrolling further up I was able to find some references to values in the registry under HKCU\SOFTWARE\Microsoft\OneDrive which, when I opened them up, contained the wrong paths. I corrected them
http://i.imgur.com/arhWYgt.png
And with that in place was able to start up OneDrive successfully again and sync down the pictures of cats that David had sent me.

The story here is that it is possible, and even easy, to figure out why a compiled application on your machine isn’t working. By examining the file and registry accesses it is making you might be able to suss out what’s going wrong and fix it.

2016-05-06

CI with F# SQL Type Providers

My experimentation with F# continues. My latest challenge has been figuring out how to get SQL type providers to work with continuous integration. The way that SQL type providers work (and I’m speaking broadly here because there are about 4 of them) is that they examine a live database and generate types from it. On your local machine this is a perfect set up because you have the database locally to do development against. However on the build server having a database isn’t as likely.

In my particular case I’m using Visual Studio Online or TFS Online or whatever the squid it is called these days. Visual studio team services that’s what it is called.

Screenshot of VSTS

I’m using a hosted build agent which may or may not have a database server on it - at least not one that I really want to rely on. I was tweeting about the issue and Dmitry Morozov (who wrote the type provider I’m using - the F# community on twitter is amazing) suggested that I just make the database part of my version control. Of course I’m already doing that but in this project I was using EF migrations. The issue with that is that I need to have the database in place to build the project and I needed to build the project to run the migrations… For those who are big into graph theory you will have recognized that there is a cycle in the dependency graph and that ain’t good.

Graph cycles

EF migrations are kind of a pain, at least that was my impression. I checked with Canada’s Julie Lerman, David Paquette, to see if maybe I was just using them wrong.

Discussion with Dave Paquette

So I migrated to roundhouse which is a story for another post. With that in place I set up a super cheap database in azure and I hooked up the build process to update that database on every deploy. This is really nice because it catches database migration issues before the deployment step. I’ve been burned by migrations which locked the database before on this project and now I can catch them against a low impact database.

One of the first step in my build process is to deploy the database.
Build process

In my F# I have a setting module which holds all the settings and it includes

1
2
3
module Settings = 
[<Literal>]
let buildTimeConnectionString = "Server=tcp:builddependencies.database.windows.net,1433;Database=BuildDependencies;User ID=build@builddependencies;Password=goodtryhackers;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;"

And this string is used throughout my code when I create the SQL based types

1
type Completions = SqlProgrammabilityProvider<Settings.buildTimeConnectionString>

and

1
2
3
let mergeCommand = new SqlCommandProvider<"""
merge systems as target
...""", Settings.buildTimeConnectionString>(ConnectionStringProvider.GetConnection)

In that second example you might notice that the build time connection string is different from the run time connection string which is specified as a parameter.

##How I wish it worked

For the most part having a database build as part of your build process isn’t a huge deal. You need it for integration tests anyway but it is a barrier for adoption. It would be cool if you could check in a serialized version of the schema and, during CI builds, point the type provider at this serialized version. This serialized version could be generated on the developer workstations then checked in. I don’t think it is an ideal solution and now I’ve done the footwork to get the build database I don’t think I would use it.

2016-04-26

Running your app on Windows Server Core Containers

Most of the day I work on an app which makes use of NServiceBus. If you’ve ever talked to me about messaging, then you know that I’m all over messaging like a ferret in a sock.
Sock Ferret

So I’m, understandibly, a pretty big fan of NServiceBus - for the most part. The thing with architecting your solution to use SOA or Microservices or whatever we’re calling it these days is that you end up with a lot of small applications. Figuring out how to deploy these can be a bit of a pain as your system grows. One solution I like is to make use of the exciting upcoming world of containers. I’ve deployed a few ASP.NET Core applications to a container but NServiceBus doesn’t work on dotnet Core so I need to us a Windows container here.

First up is to download the ISO for Windows Server Core 2016 from Microsoft. You can do that for free here. I provisioned a virtual box VM and installed Windows using the downloaded ISO. I chose to use windows server core as opposed to the version of windows which includes a full UI. The command line was good enough for Space Quest II and by gum it is good enough for me.

Starting up this vm gets you this screen
Imgur

Okay, let’s do it. Docker isn’t installed by default but there is a great article on how to install it onto an existing machine here. In short I ran

powershell.exe

Which started up powershell for me (weird that powershell isn’t the default shell). Then

1
2
wget -uri https://aka.ms/tp4/Install-ContainerHost -OutFile C:\Install-ContainerHost.ps1
& C:\Install-ContainerHost.ps1

I didn’t specify the -HyperV flag as in the linked article because I wanted Docker containers. There are two flavours of containers on Windows at the moment. HyperV containers which are heavier weight and Docker containers which are lighter. I was pretty confident I could get away with Docker containers so I started with that. The installer took a long, long time. It had to download a bunch of stuff and for some reason it decided to use the background downloader which is super slow.

Slowwwww

By default, the docker daemon only listens on 127.0.0.1 which means that you can only connect to it from inside the virtual machine. That’s not all that useful as all my stuff is outside of the virtual machine. I needed to do a couple of things to get that working.

The first was to tell docker to listen on all interfaces. Ideally you shouldn’t allow docker to bind to external interfaces without the TLS certificates installed. That was kind of a lot of work so I ignored the warning in the logs that it generates

1
/!\\ DONT' BIND ON ANY IP ADDRESS WITHOUT setting -tlsverify IF YOU DON'T KNOW WHAT YOU'RE DOING /!\\

Yeah that’s fine. To do this open up the docker start command and tell it to listen on the 0.0.0.0 interface.

1
notepad c:\programdata\docker\runDockerDaemon.cmd

Now edit the line

1
docker daemon -D -b "Virtual Switch"

to read

1
docker daemon -D -b "Virtual Switch" -H 0.0.0.0:2376

Now we need to relax the firewall rules or, in my case, turn off the firewall completely.

1
Set-NetFirewallProfile -name * -Enabled "false"

Now restart docker

1
2
net stop docker
net start docker

We should now be able to access docker from the host operating system. And indeed I can by specifying the host to connect to when using the docker tools. In my case on port 2376 on 192.168.0.13

1
2
docker -H tcp://192.168.0.13:2376 ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

Finally, we can actually start using docker.

I hammered together a quick docker file which sucked in the output of my NSB handler’s build directory.

1
2
3
4
5
6
7
FROM windowsservercore

ADD bin/Debug /funnel

WORKDIR /funnel

ENTRYPOINT NServiceBus.Host.exe

This dockerfile is based on the windowservercore image which was loaded onto the virtual machine during the setup script. You can check that using the images command to docker. To get the docker file running I first build the image then ask for it to be run

1
2
docker -H  tcp://192.168.0.13:2376 build -t funnel1 -f .\Dockerfile .
docker -H tcp://192.168.0.13:2376 run -t -d funnel1

The final command spits out a big bunch of letters and numbers which is the Id of the image. I can use that to get access to the command line output from that image

1
docker -H  tcp://192.168.0.13:2376 logs fb0d6f23050c9039e65a106bea62a9049d9f79ce6070234472c112fed516634e

Which gets me
Output

With that I’m most of the way there. I still need to figure out some networking stuff so NSB can find my database and put our license file in there and check that NSB is actually able to talk to MSMQ and maybe find a better way to get at the logs… okay there is actually a lot still to do but this is the first step.

2016-02-18

I squash my pull requests and you should too

A couple of weeks ago I made a change to my life. It was one of those big, earth shattering changes which ripple though everything: I started to squash the commits in my pull requests.

It was a drastic change but I think an overdue one. The reasons for my change are pretty simple: it makes looking at commit histories and maintaining long-lived branches easier. Before my pull requests would contain a lot of clutter: I’d check in small bits of work when I got them working and whoever was reviewing the pull request would have to look at a bunch of commits, some of which would later be reversed, to get an idea for what was going on. By squashing the commits down to a single commit I can focus on the key parts of the pull request without people having to see the mess I generated in between.

If you have long lived branches (I know, I know) then having a smaller number of commits during rebasing is a real help. There are fewer things that need merging so you don’t end up fixing the same change over and over again.

Finally the smaller number of commits in mainline give a clearer view of what has changed in the destination branch. You individual commits might just have messages like “fixing logging” but when squashed into a PR the commit becomes “Adding new functionality to layout roads automatically”. Looking back that “fixing logging” commit isn’t at all helpful once you’re no longer in the thick of the feature.

What has it done for me?

I’ve already talked about some of the great benefits in the code base but for me, individually, there were some nicities too. First off is that I don’t have to worry about screwing up so much. If I go down some crazy path in the code which doesn’t work then I can bail out of it easily and without worrying that other developers on my team (Hi, Nick!) will think less of me.

I find myself checking in code a lot more frequently. I have a git alias to just do

1
2
3
4
5
6
7
8
9
git wip
```


and that checks in whatever I have lying around. It is a very powerful undo feature. It do a git wip whenever I find myself at the completion of some logical step be it writing a unit test or finishing some function or changing some style.

# How do you do it?

It is easy. Just work as you normally would but with the added freedoms I mentioned. When you're ready to create a pull request then you can issue

git log

1
2
3
4

and find the first commit in your branch. There are some suggested ways to do this automatically but they never seem to work for me so I just do it manually.

Now you can rebase the commits

git rebase -i 0ec9df23

1
2

where 0ec9df23 is the sha of the last commit on the parent branch. This will open up an editor showing all the commits in chronological order. On left you'll see the word pick.

pick 78fc6bc added PrDC session to speaking data
pick 9741725 Podcast with Yves Goeleven

Rebase 9d792c2..9741725 onto 9d792c2

`

Starting at the bottom just change all but the first of these to squash or simply s. Save the file and exit. Git will now chug a bit and merge all the changes into one. With this complete you can push to the central repo and issue the pull request. You can add additional changes to this PR to address comments and, just before you do the merge, do another squash. You may need to push with -f but that’s okay, this time.

I’m a big fan of this approach and I hope you will be too. It is better for your sanity, for the team’s sanity and for the git history’s sanity.

2015-12-17

SQL Server Alias

Ever run into that problem where everybody on your team is using a different database instance name and every time you check out you have to update the config file with your instance name?

Boy have I seen some complicated solutions around this involving reading from environments and having private, unversioned configuration files. One of the developers on my team recommended using SQL Server Aliases to solve the problem. I fumbled around with these for a bit because I couldn’t get them to work. Eventually, with some help, I got there.

Let’s say that you have an instance on your machine called sqlexpress but that your project needs an instance called sqlexpress2012. The first thing is to open up the SQL Server Configuration Manager. The easiest way to do this is to run

SQLServerManager13.msc

where the 13 is the version number of SQL server so SQL 2014 is 12 and SQL 2016 is 13. That will give you

SQL Server Configuration Manager

The first thing to check is that your existing instance is talking over TCP/IP.

Enable TCP/IP

Then click on the properties for TCP/IPO and in the IP Addresses tab check for the TCP Dynamic Ports setting

Dynamic ports

Make note of that number because now we’re going to jump to the alias section.
Aliases
Right click in there and add a new alias

In here we are going to set the alias name to the new name we want to use. The port number is what we found above, the protocol is TCP/IP and the server is the existing server name. You then have to repeat this for the 64 bit client configuration and then restart your SQL server. You should now be able to use the new name, localhost\sqlexpress2012 to access the server.

2015-12-16

Updating Sub-Collections with SQL Server's Merge

When you get to be as old as me then you start to see certain problems reappearing over and over again. I think this might be called “experience” but it could also be called “not getting new experiences”. It might be that instead of 10 years experience I have the same year of experience 10 times. Of course this is only true if you don’t experiment and find new ways of doing things. Even if you’re doing the same job year in and year out it is how you approach the work that determines how you will grow as a developer.

One of those problems I have encountered over the years is the problem of updating a collection of records related to one record. I’m sure you’ve encountered the same thing where you present the user with a table and let them edit, delete and add records.

A collection of rows

Now how do you get that data back to the server? You could send each row back individually using some Ajax magic. This is kind of a pain, though, you have to keep track of a lot of requests and you’re making a bunch of requests. You also need to track, behind the scenes, which rows were added and which were removed so you can send specific commands for that. It is preferable to send the whole collection at once in a single request. Now you’ve shifted the burden to the server. In the past I’ve handled this by pulling the existing collection from the database and doing painful comparisons to figure out what has changed.

There is a very useful SQL command called UPSERT which you’ll find in databases such as Postgres(assuming you’re on the cutting edge and you’re using 9.5). Upsert is basically a command which looks at the existing table data when you modify a record. If the record doesn’t exist it will be created and if it is already there the contents will be updated. This solves 2/3rds of our cases with only delete missing. Unfortunately, SQL Server doesn’t support the UPSERT command - however it does support MERGE.

I’ve always avoided MERGE because I thought it to be very complicated but in the interests of continually growing I figured it was about time that I bit the bullet and just learned how it works. I use Dapper a fair bit for talking to the database, it is just enough ORM to handle the dumb stuff while still letting me write my own SQL. It is virtually guaranteed that I write worse SQL than a full ORM but that’s a cognitive dissonance I’m prepared to let ride. By writing my own SQL I have direct access to tools like merge which might, otherwise, be missed by a beefy ORM.

The first thing to know about MERGE is that it needs to run against two tables to compare their contents. Let’s extend the example we have above of what appears to be a magic wand shop… that’s weird and totally not related to having just watched the trailer for Fantastic Beasts and Where to Find Them. Anyway our order item table looks like

1
2
3
4
create table orderItems(id uniqueidentifier,
orderId uniqueidentifier,
colorId uniqueidentifier,
quantity int)

So the first task is to create a temporary table to hold our records. By prefacing a table name with a # in SQL server we get a temporary table which is unique to our session. So other running transactions won’t see the table - exactly what we want.

1
2
3
4
5
6
7
using(var connection = GetConnection())
{
connection.Execute(@"create table #orderItems(id uniqueidentifier,
orderId uniqueidentifier,
colorId uniqueidentifier,
quantity int)");
}

Now we’ll take the items collection we have received from the web client (in my case it was via an MVC controller but I’ll leave the specifics up to you) and insert each record into the new table. Remember to do this using the same session as you used to create the table.

1
2
3
4
foreach(var item in orderItems)
{
connection.Execute("insert into #orderItems(id, orderId, colorId, quantity) values(@id, @orderId, @colorId, @quantity)", item);
}

Now the fun part: writing the merge.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
merge orderItems as target
using #orderItems as source
on target.Id = source.Id
when matched then
update set target.colorId = source.colorId,
target.quantity = soruce.quantity
when not matched by target then
insert (id,
orderId,
colorId,
quantity)
values (source.id,
source.orderId,
source.colorId,
source.quantity)
when not matched by source
and orderId = @orderId then delete;

What’s this doing? Let’s break it down. First we set a target table this is where the records will be inserted, deleted and updated. Next we set the source the place from which the records will come. In our case the temporary table. Both source and destination are aliases so really they can be whatever you want like input and output or Expecto and Patronum.

1
2
merge orderItems as target
using #orderItems as source

This line instructs on how to match. Both our tables have primary ids in a single column so we’ll use that.

1
on target.Id = source.Id

If a record is matched the we’ll update the two important target fields with the values from the source.

1
2
3
when matched then
update set target.colorId = source.colorId,
target.quantity = soruce.quantity

Next we give instructions as to what should happen if a record is missing in the target. Here we insert a record based on the temporary table.

1
2
3
4
5
6
7
8
9
when not matched by target then 
insert (id,
orderId,
colorId,
quantity)
values (source.id,
source.orderId,
source.colorId,
source.quantity)

Finally we give the instruction for what to do if the record is in the target but not in the source - we delete it.

1
2
when not matched by source 
and orderId = @orderId then delete;

In another world we might do a soft delete and simply update a field.

That’s pretty much all there is to it. MERGE has a ton of options to do more powerful operations. There is a bunch of super poorly written documentation on this on MSDN if you’re looking to learn a lot more.

2015-12-06

Copy Azure Blobs

Ever wanted to copy blobs from one Azure blob container to another? Me neither, until now. I had a bunch of files I wanted to use as part of a demo in a storage container and they needed to be moved over to a new container in a new resource group. It was 10 at night and I just wanted it solved so I briefly looked for a tool to do the copying for me. I failed to find anything. Ugh, time to write some 10pm style code, that is to say terrible code. Now you too can benefit from this. I put in some comments for fun.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace migrateblobs
{
class Program
{
static void Main(string[] args)
{
//this is the source account
var sourceAccount = CloudStorageAccount.Parse("source connection string here");
var sourceClient = sourceAccount.CreateCloudBlobClient();
var sourceContainer = sourceClient.GetContainerReference("source container here");

//destination account
var destinationAccount = CloudStorageAccount.Parse("destination connection string here");
var destinationClient = destinationAccount.CreateCloudBlobClient();
var destinationContainer = destinationClient.GetContainerReference("destination container here");

//create the container here
destinationContainer.CreateIfNotExists();

//this token is used so the destination client can pull from the source
string blobToken = sourceContainer.GetSharedAccessSignature(
new SharedAccessBlobPolicy
{
SharedAccessExpiryTime =DateTime.UtcNow.AddYears(2),
Permissions = SharedAccessBlobPermissions.Read | SharedAccessBlobPermissions.Write
});


var srcBlobList = sourceContainer.ListBlobs(useFlatBlobListing: true, blobListingDetails: BlobListingDetails.None);
foreach (var src in srcBlobList)
{
var srcBlob = src as CloudBlob;

// Create appropriate destination blob type to match the source blob
CloudBlob destBlob;
if (srcBlob.Properties.BlobType == BlobType.BlockBlob)
{
destBlob = destinationContainer.GetBlockBlobReference(srcBlob.Name);
}
else
{
destBlob = destinationContainer.GetPageBlobReference(srcBlob.Name);
}

// copy using src blob as SAS
destBlob.StartCopyFromBlob(new Uri(srcBlob.Uri.AbsoluteUri + blobToken));
}
}
}
}

I stole some of this code from an old post here but the API has changed a bit since then so this article is a better reference. The copy operations take place asynchronously.

We’re copying between containers without copying down the the local machine so you don’t incur any egress costs unless you’re moving between data centers.

Have fun.