  • Richard 9:35 am on February 17, 2012

    Simple node.js socket listener 

    This simple snippet just writes back everything it receives, but it’s a useful test:

    var net = require('net');

    // Echo server: write every chunk received straight back to the sender.
    net.createServer(function (socket) {
      socket.on('data', function (data) {
        socket.write(data);
      });
    }).listen(1337);
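
    To check that a listener like this really echoes, a small client can send a message and compare what comes back. The sketch below is one way to do that and is not part of the original snippet: the echoRoundTrip name is made up for illustration, and it starts its own copy of the echo server on an ephemeral port (port 0) instead of 1337 so it cannot clash with anything already running.

```javascript
const net = require('net');

// Hypothetical helper: start an echo server on an ephemeral port, send one
// message, and resolve with whatever the server writes back.
function echoRoundTrip(message) {
  return new Promise((resolve, reject) => {
    const server = net.createServer((socket) => {
      socket.on('data', (data) => socket.write(data));
    });
    server.on('error', reject);

    server.listen(0, '127.0.0.1', () => {
      const port = server.address().port;
      const expected = Buffer.byteLength(message);
      const chunks = [];
      let received = 0;

      const client = net.connect(port, '127.0.0.1', () => client.write(message));
      client.on('error', reject);
      client.on('data', (chunk) => {
        // TCP may deliver the reply in several pieces, so collect until complete.
        chunks.push(chunk);
        received += chunk.length;
        if (received >= expected) {
          client.end();
          server.close();
          resolve(Buffer.concat(chunks).toString());
        }
      });
    });
  });
}

echoRoundTrip('hello').then((reply) => console.log('echoed:', reply));
```

    Collecting chunks until the byte count matches matters because a TCP stream does not preserve message boundaries, even when the server writes back exactly what it received.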
     
  • Richard 10:06 am on February 10, 2012

    Dynamic Table Storage 

    I maintain a project called AzureSugar, a library of extensions to the standard Windows Azure C# SDK which contains a number of handy methods for making the Azure Storage API easier to work with. I have just included support for using dynamic objects for table storage. As Table Storage does not enforce a fixed schema, it has always seemed a shame to me that the C# SDK forces you to use static types. Perhaps you don’t know the schema at compile time, perhaps you want to store a dictionary of items instead of a class, maybe you don’t want so much ceremony to just write a record to a table? The DynamicTableContext solves these problems.

    To use it, first, create a context object:

    var context = new DynamicTableContext("TableName", credentials); 

    Inserting a record is then easy. For example, using a dynamic object:

    context.Insert(new { PartitionKey="1", RowKey="1", Value1="Hello", Value2="World" }); 

    You can do the same with a dictionary:

    var dictionary = new Dictionary<string, object>();
    dictionary["PartitionKey"] = "2";
    dictionary["RowKey"] = "2";
    dictionary["Value3"] = "FooBar";
    context.Insert(dictionary); 

    Retrieving an entity is straightforward. Just pass in the values for the partition key and row key:

    dynamic entity = context.Get("1", "1"); 

    You can also pass in a query:

    foreach (dynamic item in context.Query("Value1 eq 'Hello'"))
    {
      Console.WriteLine(item.RowKey);
    }

    This is the first version. There are plenty of extra features I would like to add over time, but I hope there is enough here to be useful.

    https://github.com/richorama/AzureSugar

     
  • Richard 3:41 pm on February 6, 2012

    SQL Azure Federations 

    SQL Azure Federation is a way of splitting a table (or tables) across a number of SQL Azure databases. The technique is also known as sharding. For more information, see this MSDN article.

    There are a few use cases for federations:

    1. Your table is too big to fit in a single database (i.e. it’s greater than 150 GB).
    2. Query performance is a problem. Federation breaks the table into smaller pieces, which run on different physical hardware.
    3. In a multi-tenancy scenario, it’s a way of physically partitioning tenants into separate databases.

    Creating a federation

    Federated databases have a ‘Federation Root’ database. This is the principal database you connect to. Once connected to this, you can create a Federation with this command:

    CREATE FEDERATION federation_name (distribution_name <data_type> RANGE)

    You need to specify a ‘range’, which is a value you will use to split your data into the separate federations. Typically you would use a bigint. Federations can also be created in the management portal.

    When you create a table, you then specify which column to use for the value of the range (i.e. which column to federate on).

    USE FEDERATION federation_name(distribution_name = value) WITH RESET, FILTERING={ON|OFF}
    GO
    CREATE TABLE
        [ schema_name . ] table_name
            ( { <column_definition> | <computed_column_definition> }
            [ <table_constraint> ] [ ,...n ] )
    FEDERATED ON (distribution_name = column_name)
    GO

    Unfortunately you can’t ‘ALTER’ a table to federate it; you can only specify federation on CREATE.

    Using federation

    When you connect to the database, you need to issue an extra command to indicate that you want to work with federations. You also need to supply a value to indicate which of the federations to work with. One advantage of Federations is that you don’t need to specify the actual federation, just the value for the range. SQL Azure determines which federation this value falls in and routes you accordingly.

    USE FEDERATION federation_name (distribution_name = value)
        WITH FILTERING={ON|OFF}, RESET

    The disadvantage is that this command needs to be included in your application. The command is associated with a connection, so if there is some clever connection pooling going on (e.g. Entity Framework) you may have some extra work to do. You cannot call this command from within a stored procedure.

    With filtering on, only records which match the range value will be returned. With filtering off, all records from the federation can be returned.

    Splitting a federation

    Splitting a federation is straightforward. You can either issue a command:

    USE FEDERATION ROOT WITH RESET
    GO
    ALTER FEDERATION federation_name SPLIT AT (boundary_value)
    GO

    …or use the management portal.

    In either case, it’s an atomic operation, requiring no down-time.
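
    The routing and splitting behaviour described above can be pictured with a toy model. This is only a sketch of the idea, not how SQL Azure implements it: the memberIndexFor and splitAt names are invented for illustration, and it assumes each federation covers a range that includes its lower bound and excludes its upper bound.

```javascript
// Toy model of range-based routing. Assumption (not taken from the post):
// each member covers [low, high), so a boundary value belongs to the upper member.

// Return the index of the member that holds `value`, given sorted split points.
function memberIndexFor(boundaries, value) {
  let index = 0;
  while (index < boundaries.length && value >= boundaries[index]) {
    index++;
  }
  return index;
}

// Return a new sorted list of split points with `boundary` added.
function splitAt(boundaries, boundary) {
  if (boundaries.includes(boundary)) {
    throw new Error('A split already exists at ' + boundary);
  }
  return boundaries.concat([boundary]).sort((a, b) => a - b);
}

let boundaries = [];                    // no splits yet: one member holds everything
boundaries = splitAt(boundaries, 1000); // now two members
boundaries = splitAt(boundaries, 500);  // now three members

console.log(memberIndexFor(boundaries, 42));   // 0
console.log(memberIndexFor(boundaries, 500));  // 1
console.log(memberIndexFor(boundaries, 1500)); // 2
```

    The caller only ever supplies a value. The list of split points can change underneath it without the calling code changing, which is the separation of concerns that makes the feature attractive.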

    What’s missing?

    • Identity columns are not allowed in federated tables. You can either use GUIDs (not a very nice thing to split on) or create some kind of identity issuing table in the federation root.
    • Timestamp columns are not allowed.
    • You can’t join federations back together (unless you write the code yourself!).
    • Pricing.

    Other interesting features

    • Each federation is just a separate database. You can connect to them and alter the schema yourself if you want (I’m not sure I advise this!).
    • There is a tool to help you migrate your data.

    Conclusion

    This is the first release of SQL Azure Federations, and whilst this isn’t a panacea to solve all big data problems (I’m not sure anyone has a good solution for sharding) it does present some interesting features. Separating the layout of the federations from the concern of the application is nice, and the splitting operation is good too. It’s a shame that applications need to be modified to include the extra ‘USE’ command, but perhaps that’s something that will change?

    I see Federations as a compromise between SQL Azure and Table Store. You give away a few features (e.g. identity and timestamps) but you gain scalability.

    Another tool in the box.

     
  • Richard 11:41 am on January 24, 2012

    UKGovCamp12 

    I went along to the Gov Camp 2012 event on Saturday 21st Jan. I didn’t quite know what to expect, but I was there with my laptop, and an eagerness to roll up my sleeves and bash out some code.

    The day started with the ~200 delegates introducing themselves in turn. It was clear that there was some real talent in the room. We had an army of developers as well as UX people, PR and media types, and most importantly – people who understood Government. What could we accomplish? What could we build? The possibilities were endless.

    After the intros, people were invited to pitch for workshops or tasks they would like to undertake/run. Lots of ideas were put forward, and a few of them were ‘let’s build this’, ‘let’s design that’. I went for one of those workshops.

    We had some great discussions and bounced ideas around, with expert opinion and experience injected into the mix. However, we failed to really get anything done. The same was true of a later session: what was advertised as ‘Design and Build’ was just a talking shop. Don’t get me wrong, if other people in the session got something out of it, then it was a success, but I didn’t leave with a sense of achieving anything. I can’t help but think that if there was a little more focus on creating solutions rather than discussing problems, something real and worthwhile could have been produced.

    I did enjoy meeting some interesting people, and I took away several business cards of people I intend to contact to continue conversations. I was also inspired to see that Government are taking open source very seriously and embracing technologies like GitHub to collaborate both internally and with the public.

     

     
  • Richard 2:04 pm on January 13, 2012

    Improving Blob Upload Speeds 

    The Challenge:

    Upload a 144 MB file into Azure Blob Storage as fast as possible.

    CloudBlob.UploadFile()

    The CloudBlob.UploadFile() method doesn’t do a bad job. Behind the scenes it’s chunking the file into blocks, and using the ‘PutBlock’ method with the Task Parallel Library to upload them simultaneously.

    However, the Task Parallel Library will not upload all of the blocks at once; instead it uses a thread pool to upload a few at a time. This is good, but we can do better.

    All at once

    After looking through some performance metrics on blob upload, it seems that Azure storage performance gets better when you use 20-40 threads to upload your file.

    I modified my code to chunk the 144 MB file into 36 x 4 MB chunks, then used 36 threads to upload all the chunks simultaneously.

    The result: it’s faster (most of the time).
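
    The chunking arithmetic behind this is simple enough to sketch. What follows is an illustration in JavaScript rather than the C# I actually used, and every name in it (planBlocks, uploadAllAtOnce, fakeUpload) is hypothetical. The upload function is a stand-in that does no network I/O; a real version would send each block to blob storage.

```javascript
// Split a file of `totalBytes` into consecutive blocks of at most `blockBytes`.
function planBlocks(totalBytes, blockBytes) {
  if (!(totalBytes >= 0) || !(blockBytes > 0)) {
    throw new RangeError('totalBytes must be >= 0 and blockBytes must be > 0');
  }
  const blocks = [];
  for (let offset = 0, index = 0; offset < totalBytes; offset += blockBytes, index++) {
    blocks.push({
      index: index,
      offset: offset,
      length: Math.min(blockBytes, totalBytes - offset),
    });
  }
  return blocks;
}

// Start every upload immediately and wait for all of them to finish.
// `uploadBlock` is supplied by the caller and must return a Promise.
function uploadAllAtOnce(blocks, uploadBlock) {
  return Promise.all(blocks.map((block) => uploadBlock(block)));
}

const MB = 1024 * 1024;
const blocks = planBlocks(144 * MB, 4 * MB);
console.log(blocks.length); // 36

// Stand-in uploader: resolves with the block index without touching the network.
const fakeUpload = (block) => Promise.resolve(block.index);
uploadAllAtOnce(blocks, fakeUpload).then((done) => {
  console.log('uploaded blocks:', done.length);
});
```

    Starting all the transfers together is what separates this from a pooled approach, where only a few blocks are in flight at any moment.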

    Benchmarks

    My intention is to create a more complete list of benchmarks, but here’s what I’ve got at the moment.

    Location                                      CloudBlob.UploadFile()   All at once technique
    Azure Instance (Extra Large)                  6.1 seconds              5.3 seconds
    T1 Connection                                 21 seconds               14 seconds
    10 Mbps                                       143 seconds              615 seconds *
    Domestic ADSL Line (0.36 Mbps upload speed)   Timeout                  923 seconds

    (*) The network equipment is throttling the number of simultaneous outbound connections.

    Where to go from here?

    More performance gains can be made, but with added complexity. Peer networking could provide one answer. If the file exists in more than one place, perhaps uploading different parts from both locations would help?

    Another answer is compression. The particular file I used was already highly compressed. However, a file which could be heavily compressed could be uploaded to Azure compute instances in a GZipped stream, and then inflated and inserted into blob storage by a background process.

    I have tested this approach, and significant gains can be made, in direct correlation to the compressibility of the data. Obviously there is a cost implication.
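
    To get a feel for how much the gain depends on the data, the sketch below compares the gzip ratio of a very repetitive buffer with that of random bytes, which stand in for an already-compressed file. It is an illustration using Node's built-in zlib module, not the code I tested with, and the gzipRatio helper is a made-up name.

```javascript
const zlib = require('zlib');
const crypto = require('crypto');

// Compressed size divided by original size: lower means more compressible.
function gzipRatio(buffer) {
  return zlib.gzipSync(buffer).length / buffer.length;
}

const ONE_MB = 1024 * 1024;
const repetitive = Buffer.alloc(ONE_MB, 'abcdefgh'); // highly compressible
const random = crypto.randomBytes(ONE_MB);           // behaves like already-compressed data

console.log('repetitive:', gzipRatio(repetitive).toFixed(4));
console.log('random:    ', gzipRatio(random).toFixed(4));

// The receiving side would inflate the stream before writing to blob storage.
const restored = zlib.gunzipSync(zlib.gzipSync(repetitive));
console.log('round trip ok:', restored.equals(repetitive));
```

    Data that barely shrinks gains nothing from the extra hop through a compute instance, so it is worth measuring the ratio on a sample before committing to this route.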

     
  • Richard 1:47 pm on January 3, 2012

    Getting the samples to work on the Azure Node.js SDK 

    (You probably want to do this in Linux)

    Install Node.js & NPM

    echo 'export PATH=$HOME/local/bin:$PATH' >> ~/.bashrc
    . ~/.bashrc
    mkdir ~/local
    mkdir ~/node-latest-install
    cd ~/node-latest-install
    curl http://nodejs.org/dist/node-latest.tar.gz | tar xz --strip-components=1
    ./configure --prefix=$HOME/local
    make install
    curl http://npmjs.org/install.sh | sh

    Download the SDK

    npm install azure

    Download the dependencies

    cd azure-sdk-for-node
    npm install
    cd examples/blog
    npm install
    cd ../tasklist
    npm install
    cd ../../
    

    Update the storage credentials

    Update ./lib/services/serviceclient.js with your credentials.

    The examples use the dev storage, so just update the dev storage details to be the live endpoint URIs, and your account and access key.

    Run the examples

    Start node with either of the following commands:

    cd examples/blog
    node server
    

    or

    cd examples/tasklist
    node server
    

    Then browse to http://localhost:1337/ to try the applications.

     
    • Glenn Block 9:42 am on January 6, 2012

      Hi Richard

      Nice post!

      Instead of having to install all the modules yourself you should be able to just rely on the package.json. Go to the directory where the SDK lives (also for the samples) and just type npm install.

    • Richard 8:51 pm on January 6, 2012

      Thanks Glenn, I have updated the post accordingly. It makes the script quite a bit simpler!

  • Richard 11:50 am on December 12, 2011

    WindowsAzure.com 

    It looks like the Windows Azure website is hosted in Azure.

    C:\Users\richard.astbury>ping www.windowsazure.com
    Pinging wamktg-prod-am-001.cloudapp.net [65.52.128.236] with 32 bytes of data:

    Has this always been the case?

     
  • Richard 4:03 pm on December 6, 2011

    Writing trace to Table Storage 

    This article provides a useful walk through on how to configure the Azure Trace Listener to write to Table Storage.

    However, this line is very interesting:

    DiagnosticMonitor.Start("DiagnosticsConnectionString");

    The diagnostics monitor actually takes no notice of the setting name you pass in (other than to ensure its value is a correctly formatted connection string with HTTPS enabled). It will always use this setting:

    Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString

     

     
  • Richard 7:59 am on December 1, 2011

    Ports in use on an Azure Worker Role 

    D:\Users\richard>netstat -a
    
    Active Connections
    
      Proto  Local Address          Foreign Address        State
      TCP    0.0.0.0:135            RD00155D362B3E:0       LISTENING
      TCP    0.0.0.0:445            RD00155D362B3E:0       LISTENING
      TCP    0.0.0.0:3389           RD00155D362B3E:0       LISTENING
      TCP    0.0.0.0:16001          RD00155D362B3E:0       LISTENING
      TCP    0.0.0.0:49152          RD00155D362B3E:0       LISTENING
      TCP    0.0.0.0:49153          RD00155D362B3E:0       LISTENING
      TCP    0.0.0.0:49154          RD00155D362B3E:0       LISTENING
      TCP    0.0.0.0:49155          RD00155D362B3E:0       LISTENING
      TCP    10.211.208.81:139      RD00155D362B3E:0       LISTENING
      TCP    10.211.208.81:3389     RD00155D362B3E:49167   ESTABLISHED
      TCP    10.211.208.81:20000    RD00155D362B3E:0       LISTENING
      TCP    10.211.208.81:20000    94.245.124.139:21287   ESTABLISHED
      TCP    10.211.208.81:49156    10.211.209.20:http     ESTABLISHED
      TCP    10.211.208.81:49164    10.211.209.20:http     ESTABLISHED
      TCP    10.211.208.81:49167    RD00155D362B3E:ms-wbt-server  ESTABLISHED
      TCP    10.211.208.81:49174    blob:https             ESTABLISHED
      TCP    [::]:135               RD00155D362B3E:0       LISTENING
      TCP    [::]:445               RD00155D362B3E:0       LISTENING
      TCP    [::]:3389              RD00155D362B3E:0       LISTENING
      TCP    [::]:16001             RD00155D362B3E:0       LISTENING
      TCP    [::]:49152             RD00155D362B3E:0       LISTENING
      TCP    [::]:49153             RD00155D362B3E:0       LISTENING
      TCP    [::]:49154             RD00155D362B3E:0       LISTENING
      TCP    [::]:49155             RD00155D362B3E:0       LISTENING
      UDP    0.0.0.0:123            *:*
      UDP    0.0.0.0:500            *:*
      UDP    0.0.0.0:4500           *:*
      UDP    0.0.0.0:5355           *:*
      UDP    10.211.208.81:137      *:*
      UDP    10.211.208.81:138      *:*
      UDP    [::]:123               *:*
      UDP    [::]:500               *:*
      UDP    [::]:5355              *:*
      UDP    [fe80::988:6019:94b7:912%15]:546  *:*

     

     
  • Richard 9:09 pm on November 29, 2011

    CPU-Z on an Azure Compute Instance 

    Taken from a small instance running in the North Europe (Dublin) data centre.

     