<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Code as Craft</title>
	<atom:link href="http://codeascraft.etsy.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://codeascraft.etsy.com</link>
	<description>Just another WordPress.com site</description>
	<lastBuildDate>Wed, 16 May 2012 16:44:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='codeascraft.etsy.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://1.gravatar.com/blavatar/b137a7ea326a4fb36bf330a38c37d963?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Code as Craft</title>
		<link>http://codeascraft.etsy.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://codeascraft.etsy.com/osd.xml" title="Code as Craft" />
	<atom:link rel='hub' href='http://codeascraft.etsy.com/?pushpress=hub'/>
		<item>
		<title>Two Sides For Salvation</title>
		<link>http://codeascraft.etsy.com/2012/04/20/two-sides-for-salvation/</link>
		<comments>http://codeascraft.etsy.com/2012/04/20/two-sides-for-salvation/#comments</comments>
		<pubDate>Fri, 20 Apr 2012 19:52:19 +0000</pubDate>
		<dc:creator>akachler</dc:creator>
				<category><![CDATA[databases]]></category>
		<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[operations]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=2124</guid>
		<description><![CDATA[How do you change the structure of a database that’s getting hammered 24&#215;7, without any disruption? If you use Oracle and paid millions for it, it’s built in. If you use MySQL, it’s one of the holy grails of database operations, and one we’ve learned to do here at Etsy. We have a sharded architecture, [...]]]></description>
			<content:encoded><![CDATA[<p>How do you change the structure of a database that’s getting hammered 24&#215;7, without any disruption? If you use Oracle and paid millions for it, it’s built in. If you use MySQL, it’s one of the holy grails of database operations, and one we’ve learned to do here at Etsy.</p>
<p>We have a sharded architecture, which means data is scattered across several “shards”. Each shard holds different data from all the others, and each shard is a master-master (MM) pair. The two servers in an MM pair are masters and slaves of each other at the same time. They not only give you fault tolerance, they also divide the read and write load between them, which is impossible in the common master-slave(s) setup. MM pairs do have their own set of challenges.</p>
<p><strong>Don’t Let Your Database Generate Anything</strong></p>
<p>The main problem with MM pairs is caused by non-deterministic values generated by the database engine itself, such as autoincrement fields, random numbers and timestamps. The solution to that is that <span style="text-decoration:underline;">we don’t let the db generate anything</span>. Every value inserted/modified always comes from the application. This allows us to write to either of the two sides of a MM pair knowing it will get replicated to the other side correctly. I’ve heard that MM pairs don’t make sense since you’re executing everything twice. It’s true that you are executing everything twice, but you’re doing it already if you’re using a master-slave(s) setup, and the benefits that come from MM pairs are huge. In addition to giving you fault tolerance and load balancing, they are the key to being able to do non-disruptive, live schema changes.</p>
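<p>As a minimal sketch of what “the application generates everything” can look like, the Python below builds an INSERT in which the id and both timestamps are computed before the statement is sent. The table and column names, the helper names <tt>new_row</tt> and <tt>build_insert</tt>, and the UUID-derived id are all assumptions made for illustration; the post does not describe how ids are actually generated.</p>

```python
import time
import uuid


def build_insert(table, values):
    """Return a parameterized INSERT whose values all come from the application."""
    columns = ", ".join(values)
    placeholders = ", ".join(["%s"] * len(values))
    sql = "INSERT INTO {} ({}) VALUES ({})".format(table, columns, placeholders)
    return sql, list(values.values())


def new_row(user_id, title, now=None):
    # The id and timestamps are computed here, not by AUTO_INCREMENT or NOW(),
    # so both sides of the pair store identical values after replication.
    now = int(time.time()) if now is None else now
    return {
        "listing_id": uuid.uuid4().int % 2**63,  # hypothetical id scheme
        "user_id": user_id,
        "title": title,
        "create_date": now,
        "update_date": now,
    }


sql, params = build_insert("listings", new_row(42, "hand-thrown mug"))
print(sql)
```

<p>Because the statement carries literal values only, replaying it on the other side of the pair cannot produce a different row.</p>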
<p>The other part of the puzzle is our shard-aware software layer: our in-house built ORM. It does many different things, but for our current topic, it finds where in our shards a particular object’s data lives. Whenever we need to access the data for an object, the ORM first goes to one of two “index” servers we have, then goes to the shard that has the needed data. These index servers are also an MM pair. Index servers get a very large number of queries, but they are all extremely fast, on the order of 10-100μs. It’s common for sharded architectures not to have an index server. You simply decide on a sharding scheme when you start, say by user id, then divide the data among your shards knowing where ranges of users live. Everything works great until the number of users on a shard grows beyond what one shard can handle, and by then you’re already in trouble. By having an index server, we can move data between shards and simply update the index to point to the new location.</p>
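<p>The lookup-then-fetch flow, and why an index makes moving data cheap, can be sketched in a few lines of Python. This is a toy model, not the actual ORM: <tt>ShardIndex</tt>, <tt>fetch</tt> and <tt>migrate</tt> are hypothetical names, and plain dictionaries stand in for the index servers and the shards.</p>

```python
class ShardIndex:
    """Toy stand-in for the index servers: maps an object id to a shard number."""

    def __init__(self):
        self._shard_of = {}

    def assign(self, object_id, shard):
        self._shard_of[object_id] = shard

    def lookup(self, object_id):
        return self._shard_of[object_id]


def fetch(index, shards, object_id):
    # Step 1: ask the index where the object lives. Step 2: read from that shard.
    shard = index.lookup(object_id)
    return shards[shard][object_id]


def migrate(index, shards, object_id, new_shard):
    """Copy an object's data to another shard, then repoint the index."""
    old_shard = index.lookup(object_id)
    shards[new_shard][object_id] = shards[old_shard][object_id]
    index.assign(object_id, new_shard)  # readers now go to the new location
    del shards[old_shard][object_id]


index = ShardIndex()
shards = {1: {7: {"name": "alice"}}, 2: {}}
index.assign(7, 1)
migrate(index, shards, 7, 2)
print(index.lookup(7), fetch(index, shards, 7))
```

<p>With a fixed range-based scheme there is no <tt>assign</tt> step to call, which is why rebalancing is so much harder without an index.</p>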
<p>When it starts, our ORM reads a configuration file that contains, among other things, the list of shard servers available to it. We can add shards as needed over time by adding them to the configuration file, and we migrate users onto them so that new shards are not idle at first and the load is balanced among all shards.</p>
<p>The kicker: when we do schema changes, we take out one server from each of the MM pairs from the configuration file and gracefully restart the application. The ORM re-reads its configuration and knows only about the active shard sides. This leaves the application running on half of our database servers. <span style="text-decoration:underline;">Nobody notices</span>. We immediately see in our many graphs that one side’s traffic plummets and the other side is taking all the load.</p>
<p><a href="http://etsycodeascraft.files.wordpress.com/2012/04/sides_out_15.png"><img class="aligncenter size-full wp-image-2138" title="sides_out_1" src="http://etsycodeascraft.files.wordpress.com/2012/04/sides_out_15.png" alt="" width="500" height="114" /></a></p>
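<p>Here is a rough sketch of the “take one side out” step as a pure function over the configuration. The layout is an assumption: the post does not show the real file, so the shard names, the host names and the “A”/“B” side labels are hypothetical.</p>

```python
# Hypothetical config layout: each shard lists its two sides, "A" and "B".
SHARDS = {
    "shard_001": {"A": "db001a.example", "B": "db001b.example"},
    "shard_002": {"A": "db002a.example", "B": "db002b.example"},
}


def active_hosts(shards, sides_out=()):
    """Return, per shard, the hosts the application may connect to."""
    active = {}
    for name, sides in shards.items():
        hosts = [host for side, host in sorted(sides.items())
                 if side not in sides_out]
        if not hosts:
            # Refuse a config that would leave a shard with nothing to talk to.
            raise ValueError("shard {} would have no active side".format(name))
        active[name] = hosts
    return active


print(active_hosts(SHARDS, sides_out=("B",)))
```

<p>The guard matters more than the filtering: pulling both sides of any pair is the one mistake that would turn a quiet maintenance window into an outage.</p>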
<p>Note that half of the servers does not mean half of the data. All data lives on both sides of an MM pair. Replication is still going both ways; we never break it. The active side simply stops getting inserts/updates/deletes from the inactive side because nothing is connecting to the inactive side. But the inactive side still gets inserts/updates/deletes from the active side, since it’s still a slave. We could break replication for the ALTERs, but there’s no benefit in doing so and it adds an unnecessary step. The one exception is the session we’re actively running ALTERs in: we don’t want those statements to replicate.<br />
At this point we are ready to make as many changes as we need on the inactive side. In MySQL terms, ALTERs. These ALTERs can take anywhere from minutes to hours to complete, and they lock the tables they are modifying. That’s fine on the inactive side, but we definitely don’t want any of this work to replicate to the active side, so we run SET SQL_LOG_BIN=0 in the session before the ALTERs.<br />
When these ALTERs are done, they have been applied to the inactive side only.<br />
Another change in the config file places these servers back into active mode. We wait for load to stabilize between both sides and for replication to catch up if it has lagged behind, then we’re ready to repeat for the side that hasn’t been ALTER’ed.</p>
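<p>The per-session part of that procedure can be sketched as a small helper that assembles the statements for one session on the inactive side. Only <tt>SET SQL_LOG_BIN=0</tt> comes from the procedure above; the function name <tt>alter_session</tt> and the example table and column are made up for illustration.</p>

```python
def alter_session(alters):
    """Return the statements to run, in order, in ONE session on the inactive side."""
    for statement in alters:
        if not statement.lstrip().upper().startswith("ALTER "):
            raise ValueError("only ALTER statements expected: " + statement)
    # SQL_LOG_BIN is set for this session only, so the ALTERs stay out of the
    # binary log and never replicate to the side that is still serving traffic.
    return ["SET SQL_LOG_BIN=0"] + list(alters)


# Hypothetical schema change, for illustration only.
for statement in alter_session(["ALTER TABLE listings ADD COLUMN tagline VARCHAR(255)"]):
    print(statement + ";")
```

<p>All statements have to go through the same connection. If the ALTER ran on a different connection from the SET, it would be logged and replicated as usual.</p>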
<p>Taking sides out of production is not only useful for schema changes, but for upgrades, configuration changes, and any other necessary downtime.</p>
<p>So this is all great, and it works well for us. We routinely do schema changes with no user impact. But what if you don’t have an ORM? MySQL Proxy may be your answer. It’s very simple to have web servers connect to a pool of available backend database servers with MySQL Proxy. You can read the documentation for it at MySQL’s website. An important feature of MySQL Proxy is that it allows you to change configuration on-the-fly, so you can take servers in and out without even having to stop or reload your application.</p>
<p>MM pairs have had a bad reputation of being quirky. They can be, but as long as you don’t let your database generate anything, they work. When you need to do frequent schema changes in a 24&#215;7 environment, they are key to no-downtime schema changes.</p>
<p>If you want more details on our database architecture, you can also check <a title="Etsy Shard Architecture" href="http://www.slideshare.net/jgoulah/the-etsy-shard-architecture-starts-with-s-and-ends-with-hard">here</a></p>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/04/20/two-sides-for-salvation/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa35c4ea9811861bdcfc0b8da5a8ab02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">akachler</media:title>
		</media:content>

		<media:content url="http://etsycodeascraft.files.wordpress.com/2012/04/sides_out_15.png" medium="image">
			<media:title type="html">sides_out_1</media:title>
		</media:content>
	</item>
		<item>
		<title>Etsy Hacker Grants: Supporting Women in Technology</title>
		<link>http://codeascraft.etsy.com/2012/04/05/etsy-hacker-grants/</link>
		<comments>http://codeascraft.etsy.com/2012/04/05/etsy-hacker-grants/#comments</comments>
		<pubDate>Thu, 05 Apr 2012 20:54:14 +0000</pubDate>
		<dc:creator>Kellan Elliott-McCrea</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=2116</guid>
		<description><![CDATA[&#8220;Today, in conjunction with Hacker School, Etsy is announcing a new scholarship and sponsorship program for women in technology: we’ll be hosting the summer 2012 session of Hacker School in the Etsy headquarters, and we’re providing ten Etsy Hacker Grants of $5,000 each — a total of $50,000 — to women who want to join [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=codeascraft.etsy.com&#038;blog=16220466&#038;post=2116&#038;subd=etsycodeascraft&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<blockquote><p>&#8220;Today, in conjunction with Hacker School, Etsy is announcing a new scholarship and sponsorship program for women in technology: we’ll be hosting the summer 2012 session of Hacker School in the Etsy headquarters, and we’re providing ten Etsy Hacker Grants of $5,000 each — a total of $50,000 — to women who want to join but need financial support to do so. Our goal is to bring 20 women to New York to participate, and we hope this will be the first of many steps to encourage more women into engineering at Etsy and across the industry.&#8221; &#8211; <a href="http://www.etsy.com/blog/news/2012/etsy-hacker-grants-supporting-women-in-technology/">find out more</a>, and then checkout <a href="http://etsy.com/hacker-grants">Etsy Hacker Grants</a></p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/04/05/etsy-hacker-grants/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01457d1a0f0e533062cd0d1033fb4d7a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">kellan</media:title>
		</media:content>
	</item>
		<item>
		<title>Kernel Debugging 101</title>
		<link>http://codeascraft.etsy.com/2012/03/30/kernel-debugging-101/</link>
		<comments>http://codeascraft.etsy.com/2012/03/30/kernel-debugging-101/#comments</comments>
		<pubDate>Fri, 30 Mar 2012 15:21:59 +0000</pubDate>
		<dc:creator>avleenetsy</dc:creator>
				<category><![CDATA[engineering]]></category>
		<category><![CDATA[operations]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=2091</guid>
		<description><![CDATA[A dark fog had been rolling in that night, and we had been setting up a new cluster of servers for our CI system. CentOS 6.2, LXC and random kernel panics were all there to lend a hand. The kernel panics were new to our party, having been absent at the previous cluster setup. The [...]]]></description>
			<content:encoded><![CDATA[<p>A dark fog had been rolling in that night, and we had been setting up a new cluster of servers for our <a href="http://codeascraft.etsy.com/2011/04/20/divide-and-concur/">CI system</a>. CentOS 6.2, <a href="http://en.wikipedia.org/wiki/LXC">LXC</a> and random kernel panics were all there to lend a hand. The kernel panics were new to our party, having been absent at the previous cluster setup.</p>
<p>The first set of servers we installed had been running happily; this new set, however, was not. The new servers would kernel panic under the slightest load, and sometimes without much load at all.</p>
<p>A nice feature, which is enabled by default in CentOS 6, allows your kernel to dump a core file when it panics. When your system comes back, you can retrieve this core file, examine it and work out what happened. This is a short story on how we did it. The work we did is on CentOS but it can easily be applied to other Linux distributions too.</p>
<h2>Basics and theory</h2>
<p>When a kernel panics, it dumps a core file into /var/crash/, which can then be examined.<br />
In your tool belt, you need:</p>
<ul>
<li><tt>/usr/bin/crash</tt> (installed through the &#8220;crash&#8221; package)</li>
<li>A debuginfo kernel (downloaded and installed from <a href="http://debuginfo.centos.org/6/x86_64/" rel="nofollow">http://debuginfo.centos.org/6/x86_64/</a>)</li>
<li>The <tt>vmcore</tt> file from <tt>/var/crash/....</tt></li>
</ul>
<p><tt>crash</tt> is gdb-like. It uses <tt>gdb</tt> and lets you examine <tt>vmcore</tt> files from kernels.</p>
<h2><a name="Kerneldebugging101-Basicdebugging"></a>Basic debugging</h2>
<p>Start <tt>crash</tt> by pointing it at the <tt>vmlinux</tt> file installed by the debuginfo kernel, and the <tt>vmcore</tt> file:</p>
<div>
<pre style="font-size:11px;">sudo crash /usr/lib/debug/lib/modules/`uname -r`/vmlinux \
/var/crash/&lt;time&gt;/vmcore</pre>
</div>
<p>You will see output like this:</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">      KERNEL: /usr/lib/debug/lib/modules/&lt;kernel&gt;/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2012-03-28-23:51:01/vmcore
  [PARTIAL DUMP]
        CPUS: 16
        DATE: Wed Mar 28 23:50:56 2012
      UPTIME: 00:23:26
LOAD AVERAGE: 0.95, 1.45, 1.01
       TASKS: 986
    NODENAME: buildtest07
     RELEASE: 2.6.32-220.7.1.el6.x86_64
     VERSION: #1 SMP Wed Mar 7 00:52:02 GMT 2012
     MACHINE: x86_64  (2400 Mhz)
      MEMORY: 24 GB
       PANIC: "Oops: 0000 [#1] SMP " (check log for details)
         PID: 0
     COMMAND: "swapper"
        TASK: ffff880337eb9580  (1 of 16)
[THREAD_INFO: ffff880637d18000]
         CPU: 12
       STATE: TASK_RUNNING (PANIC)</pre>
</div>
</div>
</div>
<p>This tells us some important bits of information:</p>
<ul>
<li>The command being run was <tt>swapper</tt>. <tt>swapper</tt> is the kernel&#8217;s idle task (PID 0), which occupies a CPU when nothing else is scheduled on it. When a panic happens here, there&#8217;s a likelihood that we&#8217;re looking at something in the kernel space that broke, rather than something in user space.</li>
<li>The panic was an <tt>Oops</tt>.</li>
<li>There are dates, times, number of running processes and other handy information too.</li>
</ul>
<p>If you now run:</p>
<div>
<div>
<div>
<pre style="font-size:11px;">crash&gt; log</pre>
</div>
</div>
</div>
<p>and jump to the end, you will see much longer output like this:</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
IP: [&lt;ffffffff8142bb40&gt;] __netif_receive_skb+0x60/0x6e0
PGD 10e0fe067 PUD 10e0b0067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/block/dm-6/removable
CPU 12
Modules linked in: veth bridge stp llc e1000e serio_raw
                   i2c_i801 i2c_core sg iTCO_wdt
                   iTCO_vendor_support ioatdma dca i7core_edac
                   edac_core shpchp ext3 jbd mbcache sd_mod
                   crc_t10dif ahci dm_mirror dm_region_hash
                   dm_log dm_mod [last unloaded: scsi_wait_scan]</pre>
<pre style="padding-left:30px;font-size:11px;">Pid: 0, comm: swapper Not tainted &lt;kernel&gt; #1 Supermicro X8DTT-H/X8DTT-H
RIP: 0010:[&lt;ffffffff8142bb40&gt;]  [&lt;ffffffff8142bb40&gt;] __netif_receive_skb+0x60/0x6e0
RSP: 0018:ffff88034ac83dc0  EFLAGS: 00010246
RAX: 0000000000000060 RBX: ffff8805353896c0 RCX: 0000000000000000
RDX: ffff88053e8c3380 RSI: 0000000000000286 RDI: ffff8805353896c0
RBP: ffff88034ac83e10 R08: 00000000000000c3 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000015 R14: ffff88034ac93770 R15: ffff88034ac93784
FS:  0000000000000000(0000) GS:ffff88034ac80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000060 CR3: 000000010e130000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff880637d18000,
task ffff880337eb9580)
Stack:
 ffffc90013e37000 ffff880334bdc868 ffff88034ac83df0 0000000000000000
&lt;0&gt; ffff880334bdc868 ffff88034ac93788 ffff88034ac93700 0000000000000015
&lt;0&gt; ffff88034ac93770 ffff88034ac93784 ffff88034ac83e60 ffffffff8142c25a
Call Trace:
 &lt;IRQ&gt;
 [&lt;ffffffff8142c25a&gt;] process_backlog+0x9a/0x100
 [&lt;ffffffff814308d3&gt;] net_rx_action+0x103/0x2f0
 [&lt;ffffffff81072001&gt;] __do_softirq+0xc1/0x1d0
 [&lt;ffffffff810d94a0&gt;] ? handle_IRQ_event+0x60/0x170
 [&lt;ffffffff8100c24c&gt;] call_softirq+0x1c/0x30
 [&lt;ffffffff8100de85&gt;] do_softirq+0x65/0xa0
 [&lt;ffffffff81071de5&gt;] irq_exit+0x85/0x90
 [&lt;ffffffff814f4dc5&gt;] do_IRQ+0x75/0xf0
 [&lt;ffffffff8100ba53&gt;] ret_from_intr+0x0/0x11
 &lt;EOI&gt;
 [&lt;ffffffff812c4b0e&gt;] ? intel_idle+0xde/0x170
 [&lt;ffffffff812c4af1&gt;] ? intel_idle+0xc1/0x170
 [&lt;ffffffff813fa027&gt;] cpuidle_idle_call+0xa7/0x140
 [&lt;ffffffff81009e06&gt;] cpu_idle+0xb6/0x110
 [&lt;ffffffff814e5ffc&gt;] start_secondary+0x202/0x245
Code: 00 44 8b 1d cb be 79 00 45 85 db 0f 85 61 06 00 00 f6 83 b9
      00 00 00 10 0f 85 5d 04 00 00 4c 8b 63 20 4c 89 65 c8 49 8d
      44 24 60 &lt;49&gt; 39 44 24 60 74 44 4d 8b ac 24 00 04 00 00 4d
      85 ed 74 37 49
RIP  [&lt;ffffffff8142bb40&gt;] __netif_receive_skb+0x60/0x6e0
 RSP &lt;ffff88034ac83dc0&gt;
CR2: 0000000000000060</pre>
</div>
</div>
</div>
<p>At this level the most interesting thing to note is the line starting with <tt>BUG</tt>:</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">BUG: unable to handle kernel NULL pointer dereference at 0000000000000060</pre>
</div>
</div>
</div>
<p>This tells us why the panic happened: A NULL pointer dereference.</p>
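<p>If you end up reading many of these, the two most useful lines are easy to pull apart mechanically. The following is a small sketch, not part of <tt>crash</tt>: the <tt>summarize</tt> function and its regular expressions are hypothetical and only cover the <tt>BUG:</tt> and <tt>IP:</tt> line shapes shown in the log above.</p>

```python
import re

# The two lines copied from the log output above.
OOPS_LINES = """\
BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
IP: [<ffffffff8142bb40>] __netif_receive_skb+0x60/0x6e0
"""

BUG_RE = re.compile(r"BUG: (.+) at ([0-9a-f]+)")
IP_RE = re.compile(r"IP: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def summarize(log_text):
    """Extract the reason, faulting address and code location from an oops."""
    bug = BUG_RE.search(log_text)
    ip = IP_RE.search(log_text)
    return {
        "reason": bug.group(1),
        "fault_address": int(bug.group(2), 16),   # the address that was dereferenced
        "instruction": int(ip.group(1), 16),      # the instruction that did it
        "function": ip.group(2),
        "offset": int(ip.group(3), 16),           # offset of the instruction in the function
        "function_size": int(ip.group(4), 16),
    }


for key, value in summarize(OOPS_LINES).items():
    print(key, value)
```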
<h2><a name="Kerneldebugging101-Intermediatedebugging"></a>Intermediate debugging</h2>
<p>Now some more gnarly stuff. If you&#8217;re still here, you&#8217;re pretty damn brave.<br />
There are some cool things we can do in <tt>crash</tt> which might help us.<br />
For example, we can look at the list of running processes with <tt>ps</tt>:</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
      0      0   0  ffffffff81a8d020  RU   0.0       0      0  [swapper]
&gt;     0      0   1  ffff880638628a80  RU   0.0       0      0  [swapper]
&gt;     0      0   2  ffff880337d934c0  RU   0.0       0      0  [swapper]
&gt;     0      0   3  ffff8806386294c0  RU   0.0       0      0  [swapper]
&gt;     0      0   4  ffff880337dd3580  RU   0.0       0      0  [swapper]
&gt;     0      0   5  ffff880637c84080  RU   0.0       0      0  [swapper]
&gt;     0      0   6  ffff880337df1540  RU   0.0       0      0  [swapper]
&gt;     0      0   7  ffff880637c84ac0  RU   0.0       0      0  [swapper]
&gt;     0      0   8  ffff880337e33500  RU   0.0       0      0  [swapper]
      0      0   9  ffff880637c85500  RU   0.0       0      0  [swapper]
&gt;     0      0  10  ffff880337e774c0  RU   0.0       0      0  [swapper]
&gt;     0      0  11  ffff880637cf40c0  RU   0.0       0      0  [swapper]
&gt;     0      0  12  ffff880337eb9580  RU   0.0       0      0  [swapper]
&gt;     0      0  13  ffff880637cf4b00  RU   0.0       0      0  [swapper]
&gt;     0      0  14  ffff880337ed7540  RU   0.0       0      0  [swapper]
&gt;     0      0  15  ffff880637cf5540  RU   0.0       0      0  [swapper]
      1      0   8  ffff880638628040  IN   0.0   21364   1568  init
      2      0   8  ffff880337c714c0  IN   0.0       0      0  [kthreadd]
      3      2   0  ffff880337c70a80  IN   0.0       0      0  [migration/0]
      4      2   0  ffff880337c70040  IN   0.0       0      0  [ksoftirqd/0]
      5      2   0  ffff880337c99500  IN   0.0       0      0  [migration/0]
....</pre>
</div>
</div>
</div>
<p>The list continues on for another 950 lines. The lines starting with a <tt>&gt;</tt> indicate the processes currently active on a CPU. We are working on a 16 core system, and 14 of the cores are currently running <tt>swapper</tt> in kernel mode, with no user process scheduled on them. (The other 2 further down were <tt>gmond</tt> and <tt>php</tt>.)</p>
<p>You can also change the context you&#8217;re currently in. For example, currently we&#8217;re looking at the CPU where the kernel panic happened. However, we can switch to another CPU if we wanted to using <tt>set -c &lt;CPU&gt;</tt> and examine things there. (We don&#8217;t want to do this yet.)</p>
<p>Speaking of our kernel panic! Let&#8217;s get back to that.<br />
The log output at the end of the previous section tells us a NULL pointer dereference was the cause of the crash, and it gives us some clues as to what happened:</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">IP: [&lt;ffffffff8142bb40&gt;] __netif_receive_skb+0x60/0x6e0</pre>
</div>
</div>
</div>
<p>Here&#8217;s the code that was being executed! <tt>__netif_receive_skb</tt>, this sounds like it might have something to do with receiving things on the network interface, if I had to guess! (and if I did, I&#8217;d be right &#8211; many things in the kernel are named to be obvious).</p>
<p>We could even grep through the kernel source to find this!</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">shell&gt; ack -a __netif_receive_skb
net/core/dev.c
    2705:int __netif_receive_skb(struct sk_buff *skb)</pre>
</div>
</div>
</div>
<p>Looking at line <tt>2705</tt> of <tt>net/core/dev.c</tt> shows us the method that was running when things broke down.</p>
<h2><a name="Kerneldebugging101-Advanceddebugging"></a>Advanced debugging</h2>
<p>Are you really still reading this? You&#8217;ve just gone from brave, to foolish! There is no turning back! Here&#8217;s the rabbit hole:</p>
<p>There&#8217;s a lot more information here! Just knowing which method was invoked doesn&#8217;t help us that much. The method could be 2 lines long and the solution could be clear, but more likely it will be 100 lines long and make many references to other things. So let&#8217;s start poking at the memory to see what specifically happened:</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">IP: [&lt;ffffffff8142bb40&gt;] __netif_receive_skb+0x60/0x6e0</pre>
</div>
</div>
</div>
<p>There are two bits of information in this line:</p>
<ol>
<li><tt>ffffffff8142bb40</tt> is the memory address of the instruction that was executing when the problem occurred.</li>
<li><tt>0x60</tt> is the hex offset of that instruction from the start of <tt>__netif_receive_skb</tt> (the <tt>0x6e0</tt> after the slash is the size of the whole function). <tt>0x60</tt> is <tt>96</tt> in decimal &#8211; remember this!<br />
We can inspect the memory by doing this:</p>
<div>
<div>
<div>
<pre style="font-size:11px;">crash&gt; dis -rl ffffffff8142bb40
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../net/core/dev.c: 2706
0xffffffff8142bae0 &lt;__netif_receive_skb&gt;:       push   %rbp
0xffffffff8142bae1 &lt;__netif_receive_skb+1&gt;:     mov    %rsp,%rbp
0xffffffff8142bae4 &lt;__netif_receive_skb+4&gt;:     push   %r15
0xffffffff8142bae6 &lt;__netif_receive_skb+6&gt;:     push   %r14
0xffffffff8142bae8 &lt;__netif_receive_skb+8&gt;:     push   %r13
0xffffffff8142baea &lt;__netif_receive_skb+10&gt;:    push   %r12
0xffffffff8142baec &lt;__netif_receive_skb+12&gt;:    push   %rbx
0xffffffff8142baed &lt;__netif_receive_skb+13&gt;:    sub    $0x28,%rsp
0xffffffff8142baf1 &lt;__netif_receive_skb+17&gt;:    nopl   0x0(%rax,%rax,1)
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../net/core/dev.c: 2715
0xffffffff8142baf6 &lt;__netif_receive_skb+22&gt;:    cmpq   $0x0,0x18(%rdi)
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../net/core/dev.c: 2706
0xffffffff8142bafb &lt;__netif_receive_skb+27&gt;:    mov    %rdi,%rbx
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../net/core/dev.c: 2715
0xffffffff8142bafe &lt;__netif_receive_skb+30&gt;:    jne    0xffffffff8142bb16 &lt;__netif_receive_skb+54&gt;
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../arch/x86/include/asm/atomic_64.h: 23
0xffffffff8142bb00 &lt;__netif_receive_skb+32&gt;:    mov    0xbdcbae(%rip),%eax        # 0xffffffff820086b4
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../net/core/dev.c: 1364
0xffffffff8142bb06 &lt;__netif_receive_skb+38&gt;:    test   %eax,%eax
0xffffffff8142bb08 &lt;__netif_receive_skb+40&gt;:    jne    0xffffffff8142c008 &lt;__netif_receive_skb+1320&gt;
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../net/core/dev.c: 1367
0xffffffff8142bb0e &lt;__netif_receive_skb+46&gt;:    movq   $0x0,0x18(%rdi)
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../include/trace/events/net.h: 68
0xffffffff8142bb16 &lt;__netif_receive_skb+54&gt;:    mov    0x79becb(%rip),%r11d        # 0xffffffff81bc79e8
0xffffffff8142bb1d &lt;__netif_receive_skb+61&gt;:    test   %r11d,%r11d
0xffffffff8142bb20 &lt;__netif_receive_skb+64&gt;:    jne    0xffffffff8142c187 &lt;__netif_receive_skb+1703&gt;
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../net/core/dev.c: 2719
0xffffffff8142bb26 &lt;__netif_receive_skb+70&gt;:    testb  $0x10,0xb9(%rbx)
0xffffffff8142bb2d &lt;__netif_receive_skb+77&gt;:    jne    0xffffffff8142bf90 &lt;__netif_receive_skb+1200&gt;
/usr/src/debug/kernel-2.6.32-220.7.1.el6/.../include/linux/netpoll.h: 86
0xffffffff8142bb33 &lt;__netif_receive_skb+83&gt;:    mov    0x20(%rbx),%r12
0xffffffff8142bb37 &lt;__netif_receive_skb+87&gt;:    mov    %r12,-0x38(%rbp)
0xffffffff8142bb3b &lt;__netif_receive_skb+91&gt;:    lea    0x60(%r12),%rax
0xffffffff8142bb40 &lt;__netif_receive_skb+96&gt;:    cmp    %rax,0x60(%r12)</pre>
</div>
</div>
</div>
</li>
</ol>
<p>WOW! Lots of output!</p>
<p>What you&#8217;re looking at is the disassembly of the method from its first instruction up to the one that faulted, annotated with the source file and line number each group of instructions came from.<br />
If you look at the last line of the output, you should recognise two key things:</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">0xffffffff8142bb40 &lt;__netif_receive_skb+96&gt;:    cmp    %rax,0x60(%r12)</pre>
</div>
</div>
</div>
<ol>
<li>The memory address is the one we asked <tt>crash</tt> to disassemble</li>
<li>The number 96 is the decimal offset in the memory where the fault occurred.<br />
A few lines above this, we see a new filename and line number mentioned, <tt>include/linux/netpoll.h:86</tt>.<br />
Congratulations! This is the <strong>actual</strong> line which caused the NULL pointer dereference!</li>
</ol>
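<p>The relationship between the address and the offset is plain subtraction, and it is worth checking once by hand. A tiny sketch in Python, using only the two addresses that appear in the disassembly above (the function name <tt>symbol_offset</tt> is made up):</p>

```python
FUNC_START = 0xffffffff8142bae0   # first instruction of __netif_receive_skb
FAULT_IP = 0xffffffff8142bb40     # address from the IP: line in the log


def symbol_offset(instruction, function_start):
    """Offset of an instruction from the start of its function, in bytes."""
    return instruction - function_start


offset = symbol_offset(FAULT_IP, FUNC_START)
print(hex(offset), offset)   # 0x60 96
```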
<p>But what is it actually DOING?</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">cmp    %rax,0x60(%r12)</pre>
</div>
</div>
</div>
<p>Let&#8217;s examine this bit. In assembly, <tt>cmp</tt> compares two operands. Here the first is the register <tt>%rax</tt>, and the second, <tt>0x60(%r12)</tt>, is the memory at the address held in register <tt>%r12</tt> plus <tt>0x60</tt>. If we go back to our <tt>log</tt> output, we see the registers:</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;">RAX: 0000000000000060 RBX: ffff8805353896c0 RCX: 0000000000000000
RDX: ffff88053e8c3380 RSI: 0000000000000286 RDI: ffff8805353896c0
RBP: ffff88034ac83e10 R08: 00000000000000c3 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000</pre>
</div>
</div>
</div>
<p><tt>RAX</tt> has some data in it, but <tt>R12</tt> is NULL, so the <tt>cmp</tt> ends up reading memory at address <tt>0x60</tt> and faults!</p>
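<p>To make the arithmetic concrete, here is a minimal Python sketch that redoes by hand what the disassembly and register dump tell us. The register values and addresses are copied from the output above; the helper names <tt>effective_address</tt> and <tt>symbol_start</tt> are made up for illustration and are not part of <tt>crash</tt> or any other tool.</p>

```python
# Register values copied from the oops output above (only the ones we need).
REGS = {"rax": 0x60, "rbx": 0xFFFF8805353896C0, "r12": 0x0}


def effective_address(base_reg, displacement, regs):
    """An AT&T operand written disp(%base) means memory at regs[base] + disp."""
    return regs[base_reg] + displacement


def symbol_start(instruction_addr, decimal_offset):
    """Work back from an address and its symbol+offset to the function start."""
    return instruction_addr - decimal_offset


# cmp %rax,0x60(%r12): the memory operand lives at R12 + 0x60.
fault_target = effective_address("r12", 0x60, REGS)
print(hex(fault_target))

# Both disassembly lines should point back to the same function start.
print(hex(symbol_start(0xFFFFFFFF8142BB40, 96)))
print(hex(symbol_start(0xFFFFFFFF8142BB26, 70)))
```

<p>With <tt>R12</tt> at zero the memory operand sits at a tiny address just past NULL, which is the signature of a NULL pointer dereference through a struct member. Note that it also equals the value in <tt>RAX</tt>, which is what the preceding <tt>lea 0x60(%r12),%rax</tt> would produce from a zero <tt>R12</tt>.</p>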
<p>Now take a look at code that was mentioned earlier:</p>
<div>
<div>
<div>
<pre style="padding-left:30px;font-size:11px;"> 84 static inline int netpoll_receive_skb(struct sk_buff *skb)
 85 {
 86         if (!list_empty(&amp;skb-&gt;dev-&gt;napi_list))
 87                 return netpoll_rx(skb);
 88         return 0;
 89 }</pre>
</div>
</div>
</div>
<p>If you know Linux networking internals, a few things can be understood from this:</p>
<ol>
<li>We&#8217;re dealing with New API (NAPI) code, based on the variable name <tt>napi_list</tt></li>
<li>NAPI is invoked when the system thinks there are a large number of interrupts being handled, and switching to polling the receive (RX) queue would be more efficient</li>
<li><tt>sk_buff</tt> (socket buffer) is the struct the kernel uses to hold a network packet and its metadata</li>
<li>We&#8217;re checking if the list, <tt>&amp;skb-&gt;dev-&gt;napi_list</tt> is empty, so we&#8217;re expecting it to be defined, and be of the right type.</li>
</ol>
<p>What is probably going on here is that the code is expecting <tt>&amp;skb-&gt;dev-&gt;napi_list</tt> to hold a list of incoming packets that need to be processed. Unfortunately, it turns out that something in that chain is not set, most likely <tt>napi_list</tt> (although <tt>R12</tt> being zero hints that <tt>skb-&gt;dev</tt> itself may be the NULL pointer), causing the NULL pointer dereference.</p>
<p>Congratulations! You&#8217;ve found the problem. Finding what the cause of this is, however, will be done in a future tutorial.<br />
In the meantime we have some options for workarounds:</p>
<ol>
<li>Try and test if <tt>napi_list</tt> is NULL just before the <tt>!list_empty</tt> check, and define it. This might have other unforeseen issues though &#8211; what if it causes a memory leak? It&#8217;s certainly not fixing the <em>cause</em> of the problem.</li>
<li>Disable the use of NAPI by recompiling the network driver. This is a pretty simple operation and probably worth trying, especially if the system isn&#8217;t massively CPU bound and dropping packets.</li>
</ol>
<p>We ended up going with option 2, as it was the fastest and most likely fix.</p>
<p>But we&#8217;ll still be sleeping with one eye open&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/03/30/kernel-debugging-101/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/67cb1db413ac8e2d47a1536b9fb88e48?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">avleenetsy</media:title>
		</media:content>
	</item>
		<item>
		<title>Come have drinks with Etsy and Basho</title>
		<link>http://codeascraft.etsy.com/2012/03/16/come-have-drinks-with-etsy-and-basho/</link>
		<comments>http://codeascraft.etsy.com/2012/03/16/come-have-drinks-with-etsy-and-basho/#comments</comments>
		<pubDate>Fri, 16 Mar 2012 16:26:29 +0000</pubDate>
		<dc:creator>John Goulah</dc:creator>
				<category><![CDATA[events]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=2080</guid>
		<description><![CDATA[We&#8217;ll be doing a drinkup next week with our friends at Basho, the folks that invented Riak. We&#8217;ll be hanging out at ReBar in downtown Dumbo on Tuesday, March 20th. Come have a drink with us! Some more details here: http://basho.com/blog/technical/2012/03/16/Drinkup-with-Etsy-on-March-20/<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=codeascraft.etsy.com&#038;blog=16220466&#038;post=2080&#038;subd=etsycodeascraft&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ll be doing a drinkup next week with our friends at <a href="http://basho.com/" title="basho" target="_blank">Basho</a>, the folks that invented <a href="http://wiki.basho.com/Riak.html" title="Riak" target="_blank">Riak</a>. We&#8217;ll be hanging out at <a href="http://rebarnyc.com/" title="ReBar" target="_blank">ReBar</a> in downtown Dumbo on Tuesday, March 20th. Come have a drink with us!</p>
<p>Some more details here:<br />
<a href="http://basho.com/blog/technical/2012/03/16/Drinkup-with-Etsy-on-March-20/" title="etsy/basho drinkup" target="_blank">http://basho.com/blog/technical/2012/03/16/Drinkup-with-Etsy-on-March-20/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/03/16/come-have-drinks-with-etsy-and-basho/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8261b738d0a330c12b0fb1f65fa1e1f1?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jgoulah</media:title>
		</media:content>
	</item>
		<item>
		<title>Making it Virtually Easy to Deploy on Day One</title>
		<link>http://codeascraft.etsy.com/2012/03/13/making-it-virtually-easy-to-deploy-on-day-one/</link>
		<comments>http://codeascraft.etsy.com/2012/03/13/making-it-virtually-easy-to-deploy-on-day-one/#comments</comments>
		<pubDate>Tue, 13 Mar 2012 20:18:51 +0000</pubDate>
		<dc:creator>John Goulah</dc:creator>
				<category><![CDATA[engineering]]></category>
		<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[chef]]></category>
		<category><![CDATA[deployment]]></category>
		<category><![CDATA[first day]]></category>
		<category><![CDATA[KVM]]></category>
		<category><![CDATA[libvirt]]></category>
		<category><![CDATA[QEMU]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=2061</guid>
		<description><![CDATA[At Etsy we have one hard and fast rule for new Engineers on their first day: deploy to production. We’ve talked a lot in the past about our deployment, metrics, and testing processes. But how does the development environment facilitate someone coming in on day one and contributing something that takes them through the steps [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=codeascraft.etsy.com&#038;blog=16220466&#038;post=2061&#038;subd=etsycodeascraft&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>At Etsy we have one hard and fast rule for new Engineers on their first day:  deploy to production.   We’ve talked a lot in the past about our <a href="http://codeascraft.etsy.com/2010/05/20/quantum-of-deployment/" title="quantum of deployment" target="_blank">deployment</a>, <a href="http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/" title="measure anything, measure everything" target="_blank">metrics</a>, and <a href="http://codeascraft.etsy.com/2011/04/20/divide-and-concur/" title="divide and conquer" target="_blank">testing processes</a>.  But how does the development environment facilitate someone coming in on day one and contributing something that takes them through the steps of committing code,  running it through our tests, and deploying it with <a href="https://github.com/etsy/deployinator" title="deployinator" target="_blank">deployinator</a>? </p>
<p>A new engineer’s first task is to snap a photo using our in-house photo booth (<a href="http://www.etsy.com/blog/en/2011/a-portrait-of-america-through-the-magnolia-photo-booth" title="etsy photo booth" target="_blank">handmade of course</a>) and upload it to the about page. Everyone gets a shiny new virtual machine with a working development version of the site, along with their LDAP credentials, GitHub write access, and laptop. We use an internal cloud system for the VMs, mostly because it was the most fun thing to build, but also because it gives us the advantage of our fast internal network and dedicated hardware. The goal is a consistent environment that mirrors production as closely as possible. So what is the simplest way to build something like this in house?</p>
<p>We went with a KVM/QEMU based solution which allows for native virtualization. As an example of how you may go about building an internal cloud, here’s a little bit about our hardware setup. The hypervisor runs on <a href="http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA0-7762ENW.pdf" title="HP DL380" target="_blank">HP DL380 G7</a> servers that provide us with a total of 72G RAM and 24 cores per machine. We provision 11 guests per server, which allows each VM 2 CPU cores, 5G RAM, and a 40G hard drive. Libvirt supports <a href="http://libvirt.org/migration.html" target="_blank">live migrations</a> across non-shared storage (<a href="http://www.phoronix.com/scan.php?page=news_item&amp;px=NzkwNQ">in QEMU 0.12.2+</a>) with zero downtime, which makes it easy to allocate and balance VMs across hosts if adjustments need to be made throughout the pool.</p>
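<p>As a quick sanity check on those numbers, here is a small Python sketch of the per-host budget. It assumes no CPU or memory overcommit, which the figures above suggest but do not state; the function name <tt>host_headroom</tt> is made up for this example.</p>

```python
# Figures quoted in the paragraph above.
HOST_RAM_GB, HOST_CORES = 72, 24
GUESTS_PER_HOST = 11
GUEST_RAM_GB, GUEST_CORES, GUEST_DISK_GB = 5, 2, 40


def host_headroom(guests=GUESTS_PER_HOST):
    """Return what one host has left once `guests` VMs are allocated.

    Assumes every guest gets dedicated RAM and cores (no overcommit).
    """
    return {
        "ram_gb": HOST_RAM_GB - guests * GUEST_RAM_GB,
        "cores": HOST_CORES - guests * GUEST_CORES,
        "guest_disk_gb": guests * GUEST_DISK_GB,
    }


print(host_headroom())
```

<p>Under that assumption, 11 guests leave 17G of RAM and 2 cores for the hypervisor itself, and a twelfth guest would use up the last two cores.</p>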
<p>We create CentOS based VMs from a disk template that is maintained via <a href="http://glance.openstack.org" title="glance" target="_blank">OpenStack Glance</a>, which is a tool that provides services for discovering, registering, and retrieving virtual images. The most recent versions of the disk images are kept in sync via Glance, and exist locally on each server for use in the creation of a new VM. This is faster than trying to pull the image over the network on creation or building it from scratch using <a href="http://fedoraproject.org/wiki/Anaconda/Kickstart" title="kickstart" target="_blank">Kickstart</a> like we do in production. The image itself may have been kickstarted to match our production baseline, and we template a few key files, such as the network and hosts information, which are substituted on creation, but in the end the template is just a disk image file that we copy and reuse.</p>
<p>The VM creation process involves pushing a button on an internal web page that executes a series of steps. Similar to our one button deployment system, this allows us to iterate on the underlying system without disruption to the overall process. The web form only requires a username, which must be valid in LDAP so that the user can later log in. From there the process is logged such that it provides realtime feedback to the browser via <a href="http://dev.w3.org/html5/websockets/" title="websockets" target="_blank">websockets</a>. The first thing that happens is we find a valid IP in the subnet range, and we use <a href="http://en.wikipedia.org/wiki/Nsupdate" title="nsupdate" target="_blank">nsupdate</a> to add the DNS information about the VM. We then make a copy of the disk template which serves as the new VM image and use <a href="http://linux.die.net/man/1/virt-install" title="virt-install" target="_blank">virt-install</a> to provision the new machine. <a href="http://wiki.opscode.com/display/chef/Knife+Bootstrap" title="knife bootstrap" target="_blank">Knife bootstrap</a> is then kicked off which does the rest of the VM initialization using chef. <a href="http://www.opscode.com/chef/" title="chef" target="_blank">Chef</a> is responsible for getting the machine in a working state, configuring it so that it is running the same version of libraries and services as the other VMs, and getting a checkout of the running website.</p>
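<p>The steps above can be sketched roughly as follows. This is not our actual provisioning code: it is a dry-run planner in Python that only builds the commands and never runs them. The hostname scheme, file paths, TTL and Chef environment name are all hypothetical; the command-line options used are standard <tt>nsupdate</tt>, <tt>virt-install</tt> and <tt>knife bootstrap</tt> ones, and the RAM and CPU figures match the guest size mentioned earlier.</p>

```python
import ipaddress


def first_free_ip(subnet, used):
    """Return the first host address in `subnet` that is not in `used`."""
    for host in ipaddress.ip_network(subnet).hosts():
        if str(host) not in used:
            return str(host)
    raise RuntimeError("no free address left in " + subnet)


def plan_vm(username, subnet, used_ips,
            template="/var/lib/libvirt/images/template.img"):  # hypothetical path
    """Build (but do not run) the commands for one new developer VM."""
    ip = first_free_ip(subnet, used_ips)
    fqdn = f"{username}.vm.example.com"                # hypothetical naming scheme
    disk = f"/var/lib/libvirt/images/{username}.img"   # hypothetical path
    return {
        "ip": ip,
        # nsupdate reads its update commands from standard input.
        "nsupdate_input": f"update add {fqdn} 3600 A {ip}\nsend\n",
        "commands": [
            ["nsupdate"],
            ["cp", template, disk],
            # --import boots the copied disk instead of running an installer;
            # --ram is the older spelling of --memory (value in MB).
            ["virt-install", "--name", username, "--ram", "5120",
             "--vcpus", "2", "--disk", f"path={disk}",
             "--import", "--noautoconsole"],
            # "development" is an assumed Chef environment name.
            ["knife", "bootstrap", fqdn, "-N", username, "-E", "development"],
        ],
    }


plan = plan_vm("newhire", "10.0.0.0/29", {"10.0.0.1", "10.0.0.2"})
for command in plan["commands"]:
    print(" ".join(command))
```

<p>Keeping the plan separate from its execution makes each step easy to log and stream back to the browser, and lets the underlying commands change without touching the one-button interface.</p>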
<p>Chef is a really important part of managing all of the systems at Etsy, and we use chef <a href="http://wiki.opscode.com/display/chef/Environments" title="chef environments" target="_blank">environments</a> to maintain similar <a href="http://wiki.opscode.com/display/chef/Cookbooks" target="_blank">cookbooks</a> between development and production. It is extremely important that development does not drift from production in its configuration. It also makes it much easier to roll out new module dependencies or software version updates. The environment automatically stays in sync with the code and is a prime way to avoid strange bugs when moving changes from development to production. It allows for a good balance between keeping things centralized, controlled, and in a known state and giving the developers flexibility over what they need to do.</p>
<p>At this point the virtual machine is functional,  and the website on it can be loaded using the DNS hostname we just created.  Our various tools can immediately be run from the new VM, such as the <a href="http://codeascraft.etsy.com/2011/10/11/did-you-try-it-before-you-committed/" target="_blank">try server</a>,  which is a cluster of around 60 LXC based instances that spawn tests in parallel on your upcoming patch.  Given this ability to modify and test the code easily,  the only thing left is to overcome any fear of deployment by hopping in line and releasing those changes to the world.   Engineers can be productive from day one due to our ability to quickly create a consistent environment to write code in.</p>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/03/13/making-it-virtually-easy-to-deploy-on-day-one/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8261b738d0a330c12b0fb1f65fa1e1f1?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jgoulah</media:title>
		</media:content>
	</item>
		<item>
		<title>Scaling CI at Etsy: Divide and Concur, Revisited</title>
		<link>http://codeascraft.etsy.com/2012/03/12/scaling-ci-at-etsy-divide-and-concur-revisited/</link>
		<comments>http://codeascraft.etsy.com/2012/03/12/scaling-ci-at-etsy-divide-and-concur-revisited/#comments</comments>
		<pubDate>Mon, 12 Mar 2012 19:55:50 +0000</pubDate>
		<dc:creator>LB Denker</dc:creator>
				<category><![CDATA[engineering]]></category>
		<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[continuous deployment]]></category>
		<category><![CDATA[continuous integration]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[jenkins]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=2029</guid>
		<description><![CDATA[In a past post, Divide and Concur, we told you how we approached dividing our large test suite into smaller test suites by keeping similar tests together rather than arbitrarily dividing. Dividing tests by common points of error made triaging systemic failures quick, and enticed everyone to write faster, more deterministic tests, but not [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=codeascraft.etsy.com&#038;blog=16220466&#038;post=2029&#038;subd=etsycodeascraft&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In a past post, <em><a href="http://codeascraft.etsy.com/2011/04/20/divide-and-concur/" title="Divide and Concur" target="_blank">Divide and Concur</a></em>, we told you how we approached dividing our large test suite into smaller test suites by keeping similar tests together rather than arbitrarily dividing.</p>
<p>Dividing tests by common points of error made triaging systemic failures quick, and enticed everyone to write faster, more deterministic tests, but not all was perfect. Our Jenkins dashboard was quite verbose.</p>
<p><a href="http://etsycodeascraft.files.wordpress.com/2012/03/noisyjenkins.png"><img src="http://etsycodeascraft.files.wordpress.com/2012/03/noisyjenkins.png?w=300&h=214" alt="" title="Jenkins Divided" width="300" height="214" class="alignnone size-medium wp-image-2035" /></a></p>
<p>The numerous jobs on our dashboard were great for pinpointing where the failures were, but it was difficult to determine at which stage of the deploy pipeline the failures existed.  Some tests were executed on every commit.  Some tests were executed when the <em>QA</em> button was pushed.  Some tests were executed against a freshly pushed <em>Princess</em> or <em>Production</em> build.  We were using the Jenkins IRC plugin, and the number of messages per hour was drowning out necessary communication in the <code>#push</code> channel.</p>
<p><a href="http://etsycodeascraft.files.wordpress.com/2012/03/6811600706_62b86e091a_o.png"><img src="http://etsycodeascraft.files.wordpress.com/2012/03/6811600706_62b86e091a_o.png?w=262&h=300" alt="" title="6811600706_62b86e091a_o" width="262" height="300" class="alignnone size-medium wp-image-2057" /></a></p>
<p>We needed some way to communicate the test status at each stage of the deployment pipeline.</p>
<p>We considered using <a href="https://wiki.jenkins-ci.org/display/JENKINS/Terminology" title="Jenkins Terminology" target="_blank">Downstream Jobs</a>, but fingerprinting was awkward and difficult to set up, and all-in-all it wasn&#8217;t quite what we were looking for.</p>
<p>We also considered <a href="https://wiki.jenkins-ci.org/display/JENKINS/Building+a+matrix+project" title="Jenkins Matrix" target="_blank">Matrix Jobs</a>, but a Matrix Job is designed to execute several jobs with parameter(s) varied along configuration vector(s), i.e. build node, operating system, browser, arbitrary parameter, etc.  This was not a fit for the purpose because our jobs had wildly different configurations that could not be coerced into mere parameter differences.</p>
<p>What we needed was a way to create a Jenkins job type that would execute a selection of arbitrary Jenkins jobs, wait for the jobs to finish, and report a single result while still making it possible to drill down to sub-jobs to determine the sources of failures.</p>
<p>So we wrote a Jenkins plugin to achieve this, the <a href="https://github.com/etsy/jenkins-master-project" title="Etsy GitHub: Jenkins Master Project Plugin" target="_blank">Jenkins Master Project Plugin</a>.</p>
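<p>The idea of a job that fans out to arbitrary sub-jobs, waits for them, and reports one result can be sketched in a few lines of Python. This is only an illustration of the aggregation logic, not the plugin&#8217;s code: the sub-jobs here are plain callables, and the names <tt>run_master</tt> and <tt>SEVERITY</tt> are made up for the example.</p>

```python
from concurrent.futures import ThreadPoolExecutor

# Higher number means a worse outcome; the worst sub-result wins.
SEVERITY = {"SUCCESS": 0, "UNSTABLE": 1, "FAILURE": 2}


def run_master(sub_jobs):
    """Run every sub-job, wait for all of them, and summarize.

    `sub_jobs` maps a job name to a callable returning one of the
    SEVERITY keys. Returns (overall_result, per_job_results) so the
    caller can still drill down to individual sub-jobs.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(sub_jobs))) as pool:
        futures = {name: pool.submit(job) for name, job in sub_jobs.items()}
        results = {name: future.result() for name, future in futures.items()}
    overall = max(results.values(), key=SEVERITY.__getitem__, default="SUCCESS")
    return overall, results


# Stand-in sub-jobs; real ones would trigger builds and poll for completion.
overall, results = run_master({
    "unit-tests": lambda: "SUCCESS",
    "integration-tests": lambda: "UNSTABLE",
    "functional-tests": lambda: "FAILURE",
})
print(overall, results)
```

<p>Returning the per-job results alongside the single overall status is what keeps the dashboard quiet while still letting you find the failing sub-job.</p>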
<p>Now our Jenkins dashboard represents the deployment pipeline:</p>
<p><a href="http://etsycodeascraft.files.wordpress.com/2012/03/pipelinejenkins.png"><img src="http://etsycodeascraft.files.wordpress.com/2012/03/pipelinejenkins.png?w=300&h=137" alt="" title="Pipeline Jenkins Dashboard" width="300" height="137" class="alignnone size-medium wp-image-2037" /></a></p>
<p>When a stage turns red (or yellow), you can click through to that particular Master Build, see what tests failed and drill through the results (or even rebuild).</p>
<p><a href="http://etsycodeascraft.files.wordpress.com/2012/03/drillthrough.png"><img src="http://etsycodeascraft.files.wordpress.com/2012/03/drillthrough.png?w=300&h=183" alt="" title="Master Build Sub-Results" width="300" height="183" class="alignnone size-medium wp-image-2039" /></a></p>
<p>We also wrote a <a href="https://github.com/etsy/jenkins-triggering-user" title="Jenkins Plugin: Triggering User" target="_blank">Triggering User Plugin</a> for determining the user who triggered the build and a <a href="https://github.com/etsy/jenkins-deployinator" title="Jenkins Plugin: Deployinator" target="_blank">Deployinator Plugin</a> to link key <a href="http://codeascraft.etsy.com/2010/05/20/quantum-of-deployment/" title="Quantum of Deployment" target="_blank">Deployinator</a> information to a particular Jenkins build.</p>
<p>The <em>Triggering User</em> and <em>Master Project</em> plugins are both integral to our latest version of <a href="http://codeascraft.etsy.com/2011/10/11/did-you-try-it-before-you-committed/" title="Code as Craft: Did You Try It Before You Committed?" target="_blank">Try</a>.</p>
<p>We have also made our <a href="https://github.com/etsy/nagios-jenkins-plugin" title="Etsy GitHub: Nagios Jenkins Plugin" target="_blank">Nagios plugin for Jenkins</a> readily available on the Etsy GitHub account.  We used this for experimenting with alerting on the health of Jenkins.</p>
<p>All of these plugins are freely available on GitHub under the Etsy organization.  Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/03/12/scaling-ci-at-etsy-divide-and-concur-revisited/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/34d6c586457e85a8155b0a26dbe4f7a0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">elblinkin</media:title>
		</media:content>

		<media:content url="http://etsycodeascraft.files.wordpress.com/2012/03/noisyjenkins.png?w=300" medium="image">
			<media:title type="html">Jenkins Divided</media:title>
		</media:content>

		<media:content url="http://etsycodeascraft.files.wordpress.com/2012/03/6811600706_62b86e091a_o.png?w=262" medium="image">
			<media:title type="html">6811600706_62b86e091a_o</media:title>
		</media:content>

		<media:content url="http://etsycodeascraft.files.wordpress.com/2012/03/pipelinejenkins.png?w=300" medium="image">
			<media:title type="html">Pipeline Jenkins Dashboard</media:title>
		</media:content>

		<media:content url="http://etsycodeascraft.files.wordpress.com/2012/03/drillthrough.png?w=300" medium="image">
			<media:title type="html">Master Build Sub-Results</media:title>
		</media:content>
	</item>
		<item>
		<title>Going to SxSW? Checkout All Girl* Dev Brunch 2012</title>
		<link>http://codeascraft.etsy.com/2012/03/07/going-to-sxsw-checkout-all-girl-dev-brunch-2012/</link>
		<comments>http://codeascraft.etsy.com/2012/03/07/going-to-sxsw-checkout-all-girl-dev-brunch-2012/#comments</comments>
		<pubDate>Wed, 07 Mar 2012 20:51:57 +0000</pubDate>
		<dc:creator>Kellan Elliott-McCrea</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=2043</guid>
		<description><![CDATA[Next Monday, Garann Means, author of Node for Front-End Developers: Writing Server-Side JavaScript Applications, and Etsy&#8217;s newest engineer, is hosting All Girl* Dev Brunch 2012 . If you&#8217;re a female developer or someone who wants to get more female developers involved in your organization, Austin All-Girl Hack Night and Girl Develop It invite you to [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=codeascraft.etsy.com&#038;blog=16220466&#038;post=2043&#038;subd=etsycodeascraft&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Next Monday, Garann Means, author of <a href="http://shop.oreilly.com/product/0636920023258.do">Node for Front-End Developers: Writing Server-Side JavaScript Applications</a>, and Etsy&#8217;s newest engineer, is hosting <a href="http://garann.com/allgirlhacknight/2012/">All Girl* Dev Brunch 2012 </a>.</p>
<blockquote><p>If you&#8217;re a female developer or someone who wants to get more female developers involved in your organization, Austin All-Girl Hack Night and Girl Develop It invite you to come have brunch with us, talk about dev, and hopefully make some new friends! We&#8217;ll be serving a full breakfast,  including coffee and breakfast cocktails, thanks to our gracious sponsors: Etsy, Bocoup, spire.io, Headspring, and TEKsystems.</p></blockquote>
<p>Also, the <a href="http://garann.com/allgirlhacknight/2012/cli.html">RSVP is a shell</a>, and <code>invitation</code> is executable, so save yourself the pain of trying to <code>cat</code> it like me.</p>
<p>Badge optional, RSVP mandatory, Unix skills strongly suggested to get through the RSVP, a willingness to drink breakfast cocktails recommended.</p>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/03/07/going-to-sxsw-checkout-all-girl-dev-brunch-2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01457d1a0f0e533062cd0d1033fb4d7a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">kellan</media:title>
		</media:content>
	</item>
		<item>
		<title>Google Safe Browsing without The Browser</title>
		<link>http://codeascraft.etsy.com/2012/03/04/google-safe-browsing/</link>
		<comments>http://codeascraft.etsy.com/2012/03/04/google-safe-browsing/#comments</comments>
		<pubDate>Sun, 04 Mar 2012 19:32:23 +0000</pubDate>
		<dc:creator>Nick Galbreath</dc:creator>
				<category><![CDATA[engineering]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=1903</guid>
		<description><![CDATA[At Etsy, we are constantly evaluating the security and safety of our members as they use the site.  One way we do this is by analyzing user generated content (UGC) for possible problems.  As part of the process we integrate results from the Google Safe Browsing (GSB) service. Typically this is client-side technology used by [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=codeascraft.etsy.com&#038;blog=16220466&#038;post=1903&#038;subd=etsycodeascraft&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>At Etsy, we are constantly evaluating the security and safety of our members as they use the site.  One way we do this is by analyzing user generated content (UGC) for possible problems.  As part of the process we integrate results from the <a href="http://code.google.com/apis/safebrowsing/">Google Safe Browsing</a> (GSB) service. Typically this is client-side technology used by web browsers to protect the end-user from visiting dangerous websites that might serve malware or be part of a phishing scam. </p>
<p><a href="http://etsycodeascraft.files.wordpress.com/2012/03/suspected-malware-site.png"><img src="http://etsycodeascraft.files.wordpress.com/2012/03/suspected-malware-site.png?w=300&h=130" alt="" title="Suspected Malware Site" width="300" height="130" class="aligncenter size-medium wp-image-2000" /></a></p>
<p>The <a href="http://www.etsy.com/careers/job_description.php?job_id=olfaWfwq">Security and Defensive Systems</a> group here at Etsy has flipped this model around.  Rather than warn the user when a malicious link is followed, we block the link (or the whole page) from displaying in the first place.</p>
<p>There are a few ways to use the Google Safe Browsing service. For lower-volume queries, there is a simple <a href="http://code.google.com/apis/safebrowsing/lookup_guide.html">REST API</a>.  For high-volume, high-performance systems, the <a href="http://code.google.com/apis/safebrowsing/developers_guide_v2.html">GSB V2 protocol</a> is more appropriate, as it mirrors the entire GSB database locally. It&#8217;s designed to scale to an extremely large number of clients while minimizing network traffic.  To do so, it uses a complicated protocol involving multiple blacklists and whitelists sent as a series of distributed binary diffs.</p>
<p>While many implementations of the GSB protocols are available, several traits made them unsuitable for Etsy’s operational environment (e.g. use of auto-increment IDs, or being designed to run under a web server), so we created our own.  We have open sourced our version and made it available in our <a href="https://github.com/etsy/gsb4ugc">gsb4ugc</a> git repository. It&#8217;s in PHP, but it should be straightforward to port to other languages, as it&#8217;s really more of a toolkit than a standalone product. </p>
<p>To use it, you&#8217;ll need to assemble a few components into your own API. First, set up some boilerplate shared by both the GSB updater and the client:</p>
<pre>
// Set up a db connection.
$dbh = new PDO('mysql:host=127.0.0.1;dbname=gsb', 'user', 'password');
$dbh-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Create storage; works with mysql, sqlite.
// No auto-increment IDs, so it's safe with master-master replication.
// Etsy subclasses this and adds <a href="https://github.com/etsy/statsd">StatsD</a> calls. <a href="http://etsy.me/dQwVXi">http://etsy.me/dQwVXi</a>
$storage = new GSB_StoreDB($dbh);

// Create network access. Pass in your <a href="http://code.google.com/apis/safebrowsing/key_signup.html">GSB API key</a>. Uses <a href="http://php.net/manual/en/book.curl.php">PHP curl</a>.
// $api is assumed to already hold that key as a string.
$network = new GSB_Request($api);

// Logger. Subclass to use your logging infrastructure (or not).
$logger = new GSB_Logger(5);
</pre>
<p>Next, set up a cron job that runs every 30 minutes to mirror the GSB database locally. </p>
<pre>
// $gsblists holds the GSB lists to mirror and must be defined beforehand.
$updater = new GSB_Updater($storage, $network, $logger);
$updater-&gt;downloadData($gsblists, FALSE);
</pre>
<p>It takes about 24 hours to fully sync. Once that is done, you can start checking URLs:</p>
<pre>
$client = new GSB_Client($storage, $network, $logger);
$url = "http://malware.testing.google.test/testing/malware/";
print_r($client-&gt;doLookup($url));
</pre>
<p>This should print something similar to:</p>
<pre>
[list_id] =&gt; 1
[add_chunk_num] =&gt; 70219
[host_key] =&gt; b2ae8c6f
[prefix] =&gt; 51864045
[match] =&gt; malware.testing.google.test/testing/malware/
[hash] =&gt; 518640453f8b2a5f0d43bc2251....
[host] =&gt; testing.google.test/
[url] =&gt; http://malware.testing.google.test/testing/malware/
[listname] =&gt; goog-malware-shavar
</pre>
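<p>In the output above, the prefix field is the leading 4 bytes (8 hex characters) of the hash field. The sketch below shows that relationship, assuming that the full hash is a SHA-256 digest of the matched expression, as in the GSB V2 protocol. The helper name gsb_hash_prefix is hypothetical and is not part of gsb4ugc.</p>
<pre>
// Hypothetical helper, not part of gsb4ugc: derive the 4-byte prefix
// (8 hex characters) from the SHA-256 digest of a lookup expression.
function gsb_hash_prefix($expression) {
    $full = hash('sha256', $expression); // 64 hex characters
    return substr($full, 0, 8);          // keep only the first 4 bytes
}

// The expression is the "match" value from the sample output above.
$expression = 'malware.testing.google.test/testing/malware/';
echo gsb_hash_prefix($expression), "\n";
</pre>
<p>Storing only short prefixes locally keeps the mirrored database small. As we understand the protocol, a prefix match is then confirmed against the full hash, which is where the occasional network call comes from.</p>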
<p>More details are in the <a href="https://github.com/etsy/gsb4ugc/tree/master/bin-sample">bin/samples</a> directory of our repository.</p>
<p>We are currently scanning a few types of user generated content in production.  This is done asynchronously from the website so we don&#8217;t block the user experience, but we still care about performance.  Almost all performance metrics here at Etsy record maximum and minimum times, as well as the 90th percentile and mean, and this is no exception. The peak times occur when a network call is required; otherwise, a lookup typically takes about 5ms.</p>
<p><a href="http://etsycodeascraft.files.wordpress.com/2012/03/render.png"><img src="http://etsycodeascraft.files.wordpress.com/2012/03/render.png?w=300&h=214" alt="GSB performance graph," title="GSB lookup performance graph, peak, 90%, mean, min" width="300" height="214" class="aligncenter size-medium wp-image-2007" /></a></p>
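<p>For readers who want to produce the same four numbers from their own timings, here is a minimal sketch. It is illustrative only: the function name summarize_timings and the sample values are made up, it uses the nearest-rank definition of the 90th percentile, and it is not how our metrics pipeline computes these values.</p>
<pre>
// Illustrative only: summarize lookup timings given in milliseconds.
function summarize_timings(array $samples) {
    if (count($samples) === 0) {
        return null; // nothing to summarize
    }
    sort($samples); // sorts the local copy; the caller's array is untouched
    $n = count($samples);
    $min = $samples[0];
    $max = $samples[$n - 1];
    $mean = array_sum($samples) / $n;
    // Nearest-rank 90th percentile: the value at position ceil(0.9 * n).
    $p90 = $samples[(int) ceil(9 * $n / 10) - 1];
    return compact('min', 'max', 'mean', 'p90');
}

// Made-up timings: mostly local lookups plus one slow network call.
print_r(summarize_timings(array(4, 5, 5, 5, 6, 5, 4, 5, 120, 5)));
</pre>
<p>A single slow network call pulls the maximum and the mean up sharply while the 90th percentile stays close to the typical case, which is why we look at all four values together.</p>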
<p>Since this is security-related code, another goal of <a href="https://github.com/etsy/gsb4u/">gsb4ugc</a> is testability.  The protocol-parsing code is separated from the database and networking code, so it&#8217;s very easy to write <a href="https://github.com/etsy/gsb4u/tree/master/phpunit">unit tests</a>. The tests also help explain how the code works.  As you can see below, we have some more work to do:</p>
<p><img src="http://etsycodeascraft.files.wordpress.com/2012/03/http___10-101-194-19_8000_report_.png?w=300&h=44" alt="Code Coverage Detail" title="Detail from Code Coverage Report" width="300" height="44" class="aligncenter size-medium wp-image-2003" /></p>
<p>In addition to expanding test coverage and improving performance, we&#8217;d like to add <a href="http://code.google.com/apis/safebrowsing/developers_guide_v2.html#GetKeyRequests">MAC</a> support, and to use it for more content types on Etsy.  We’d also like to add the results from <a href="http://www.phishtank.com/">PhishTank</a> for completeness and redundancy.  Comments, bug reports, patches and pull requests are all welcome, but if this type of work interests you, consider doing it <a href="http://www.etsy.com/careers/job_description.php?job_id=olfaWfwq">full time</a>.</p>
<p>Now, go forth and browse and consume content safely!</p>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/03/04/google-safe-browsing/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/eee9f51462aaf5b4550276bfc3b284fd?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">nickgetsy</media:title>
		</media:content>

		<media:content url="http://etsycodeascraft.files.wordpress.com/2012/03/suspected-malware-site.png?w=300" medium="image">
			<media:title type="html">Suspected Malware Site</media:title>
		</media:content>

		<media:content url="http://etsycodeascraft.files.wordpress.com/2012/03/render.png?w=300" medium="image">
			<media:title type="html">GSB lookup performance graph, peak, 90%, mean, min</media:title>
		</media:content>

		<media:content url="http://etsycodeascraft.files.wordpress.com/2012/03/http___10-101-194-19_8000_report_.png?w=300" medium="image">
			<media:title type="html">Detail from Code Coverage Report</media:title>
		</media:content>
	</item>
		<item>
		<title>The Etsy Way</title>
		<link>http://codeascraft.etsy.com/2012/02/13/the-etsy-way/</link>
		<comments>http://codeascraft.etsy.com/2012/02/13/the-etsy-way/#comments</comments>
		<pubDate>Mon, 13 Feb 2012 13:30:53 +0000</pubDate>
		<dc:creator>Chad Dickerson</dc:creator>
				<category><![CDATA[philosophy]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=1953</guid>
		<description><![CDATA[As you might imagine, we at Etsy get a lot of &#8220;can I pick your brain?&#8221; requests about how we do things at Etsy, or what we&#8217;ll call here The Etsy Way. While we take these requests as huge compliments to the work we do, we have to be somewhat protective of the team&#8217;s time. [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=codeascraft.etsy.com&#038;blog=16220466&#038;post=1953&#038;subd=etsycodeascraft&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>As you might imagine, we at Etsy get a lot of &#8220;can I pick your brain?&#8221; requests about how we do things at Etsy, or what we&#8217;ll call here The Etsy Way.  While we take these requests as huge compliments to the work we do, we have to be somewhat protective of the team&#8217;s time.   We&#8217;re proud of what we&#8217;ve been doing and believe in sharing it, so we&#8217;ve invested hundreds (if not thousands) of hours into providing public information on this blog and elsewhere.  This is the best way to scale our sharing as broadly as possible. (And we&#8217;ll still meet with some people &#8212; we&#8217;ll just ask that you read everything below first since we&#8217;ve worked so hard on it!)  Consider this post that first friendly conversation over coffee.</p>
<p>The most important component of The Etsy Way is <em>culture</em> and that is as difficult to teach as it is important.  To get a sense of how we think about culture, take a look at <a href="http://codeascraft.etsy.com/2011/06/06/optimizing-for-developer-happiness/">Optimizing for Developer Happiness</a>, which includes a <a href="http://www.youtube.com/watch?v=22EECFEk9Xs">24-minute video</a> of a talk I did and a link to the <a href="http://www.slideshare.net/chaddickerson/optimizing-for-developer-happiness">accompanying slides</a>.</p>
<p>Here are a few more links about culture: </p>
<ul>
<li><a href="http://blog.chaddickerson.com/2010/08/05/scaling-startups/">Scaling startups</a></li>
<li><a href="http://codeascraft.etsy.com/2011/02/04/how-does-etsy-manage-development-and-operations/">How does Etsy manage development and operations?</a></li>
<li><a href="http://www.slideshare.net/chaddickerson/code-as-craft-building-a-strong-engineering-culture-at-etsy">Code as Craft: Building a Strong Engineering Culture at Etsy</a> (slides)</li>
</ul>
<p>With the culture bits explained, below are a few other key posts in the Etsy canon.  All of these are inter-related with the culture, of course, and help reinforce it (remember it&#8217;s all about <em>culture</em>.  Did someone say &#8220;culture&#8221;?):</p>
<p><a href="http://codeascraft.etsy.com/2010/05/20/quantum-of-deployment/">Quantum of Deployment</a> (Erik Kastner).  We deployed code to production more than 10,000 times in 2011.  If you wonder &#8220;how did they do that?&#8221; this post will tell you all you need to know.</p>
<p><a href="http://codeascraft.etsy.com/2010/12/08/track-every-release/">Track every release</a> (Mike Brittain). Here, we write about the methods we use to track the success of every code deploy with application metrics.  This is part of the not-so-secret sauce.</p>
<p><a href="http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/">Measure Anything, Measure Everything</a> (Ian Malpass).  We introduce you to StatsD, the open source software we built at Etsy to enable obsessive tracking of application metrics and just about anything else in your environment.  The best part is you can <a href="https://github.com/etsy/statsd">download StatsD yourself</a> and try it out.</p>
<p><a href="http://codeascraft.etsy.com/2011/04/20/divide-and-concur/">Divide and Concur</a> (Noah Sussman and Laura Beth Denker).  By reading this post, you&#8217;ll learn about all the inner workings of our automated testing setup: what software we use (with plenty of links), how we set it up, and the philosophy behind it all.</p>
<p>We also have tons of slides from talks we have done, all available in the <a href="http://www.slideshare.net/group/code-as-craft/">Code as Craft group on Slideshare</a>.  </p>
<p>And last but not least, we have an <a href="https://github.com/etsy/">Etsy Github repository</a> with lots of goodies.</p>
<p>Pretty much everything we write about above is open source (even the culture), so the motivated reader will find links to tips and actual software along the way to set things up on their own.  If there&#8217;s anything more you&#8217;d like to know about The Etsy Way, just let us know in the comments.  We&#8217;ll add it if we have it, and probably write it if we don&#8217;t.</p>
<p><em>As you can tell, a really important part of The Etsy Way is encouraging people on the team to <a href="http://github.com/etsy/">contribute to open source</a>, write <a href="http://codeascraft.etsy.com/">informative and entertaining blog posts</a>, and put together <a href="http://www.slideshare.net/group/code-as-craft/">killer presentations</a>.  If you want to join the fun, <a href="http://www.etsy.com/careers/">we&#8217;re always hiring</a>. </em></p>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/02/13/the-etsy-way/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/189b64c70b0ce93763d679ee1a8e0bd1?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">nc expat</media:title>
		</media:content>
	</item>
		<item>
		<title>Upcoming, Etsy Engineering Near You</title>
		<link>http://codeascraft.etsy.com/2012/02/01/upcoming-etsy-engineering-near-your/</link>
		<comments>http://codeascraft.etsy.com/2012/02/01/upcoming-etsy-engineering-near-your/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 21:24:56 +0000</pubDate>
		<dc:creator>Kellan Elliott-McCrea</dc:creator>
				<category><![CDATA[engineering]]></category>
		<category><![CDATA[events]]></category>

		<guid isPermaLink="false">http://codeascraft.etsy.com/?p=1889</guid>
		<description><![CDATA[A few places to look for us in the next few months. Michelle D&#8217;Netto and Lindsey Baron, February 23, Selenium 101 Workshop. Brooklyn, NY. Laura Beth Denker, February 24, Scaling Communication via Continuous Deployment. London. We&#8217;re sponsoring Devopsdays Austin, April 2nd and 3rd. Austin, TX. Look for us. John Goulah, April 11th, Starts with S [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=codeascraft.etsy.com&#038;blog=16220466&#038;post=1889&#038;subd=etsycodeascraft&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A few places to look for us in the next few months.</p>
<p>Michelle D&#8217;Netto and Lindsey Baron, February 23, <a href="http://www.meetup.com/NYCSelenium/events/52650122/?a=me1o_grp&amp;rv=me1o">Selenium 101 Workshop</a>. Brooklyn, NY.</p>
<p>Laura Beth Denker, February 24, <a href="http://www.phpconference.co.uk/talk/scaling-communication-continuous-integration">Scaling Communication via Continuous Deployment</a>.  London.</p>
<p>We&#8217;re sponsoring <a href="http://www.devopsdays.org/events/2012-austin/">Devopsdays Austin</a>, April 2nd and 3rd.  Austin, TX.  Look for us.</p>
<p>John Goulah, April 11th, <a href="http://www.percona.com/live/mysql-conference-2012/sessions/etsy-shard-architecture-starts-s-and-ends-hard">Starts with S and Ends With Hard: The Etsy Shard Architecture</a>. Santa Clara, CA</p>
<p>Michelle D&#8217;Netto, Stephen Hardisty and Noah Sussman, April 16-18, <a href="http://www.seleniumconf.org/workshops/">Handmade Etsy Tests</a> and <a href="http://www.seleniumconf.org/speakers/">Selenium In the Enterprise: What Went Right, What Went Wrong (So Far)</a>.  London.</p>
<p>Laura Beth Denker, May 22nd, <a href="http://tek12.phparch.com/talks/#Developer-Testing-201-When-to-Mock-and-When-to-Integrate">Developer Testing 201: When to Mock and When to Integrate</a> and <a href="http://tek12.phparch.com/talks/#Its-More-Than-Just-Style">It&#8217;s More Than Just Style</a>.  Chicago, IL</p>
]]></content:encoded>
			<wfw:commentRss>http://codeascraft.etsy.com/2012/02/01/upcoming-etsy-engineering-near-your/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01457d1a0f0e533062cd0d1033fb4d7a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">kellan</media:title>
		</media:content>
	</item>
	</channel>
</rss>
