Friday, October 30, 2009
URLs and domain names in international characters
Saturday, October 17, 2009
Wednesday, September 16, 2009
improving mysql performance: some notes and links
- NDB mysql clusters. Handles transactions, is about as fast as memory tables, and can do replica / master-master clusters; because it's multi-server, it's fundamentally more scalable as a transaction engine.
- Mysql + Hadoop: recently there has been a lot of noise about using them in conjunction as a way to dramatically scale the kinds of read operations you can do in a replicated relational database environment. E.g., the data in the tables is enormous, and you want to run complex operations over all of it at once, the kind that would typically blow up a single database server (even though the data might all fit on one). I'm not sure anyone has a setup working out of the box, but the idea is that you load a "snapshot" into hadoop as you would for a new replica, then process binlogs (row-based replication updates are ideal for this) to keep the dataset in the hadoop cluster current. This would work pretty smoothly with, say, our ML database. I wonder if the INVERSE could be done too (take a dataset in a hadoop cluster, compute some set of updates, and then generate the appropriate row updates to send to your live DB cluster).
Tuesday, September 15, 2009
ec2 security groups restrictions within the cluster
When authorizing a user/group pair permission, GroupName, SourceSecurityGroupName and SourceSecurityGroupOwnerId must be specified. When authorizing a CIDR IP permission, GroupName, IpProtocol, FromPort, ToPort and CidrIp must be specified. Mixing these two types of parameters is not allowed.
http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/index.html?ApiReference-soap-AuthorizeSecurityGroupIngress.html
http://docs.amazonwebservices.com/AWSEC2/latest/CommandLineReference/index.html?ApiReference-cmd-AuthorizeSecurityGroupIngress.html
However, this documentation is straight up WRONG. It is indeed possible, although completely undocumented, to add access restrictions that reference amazon accounts, security groups, protocols, and ports all together. You can only use the SOAP API (as well as the command line tool ec2-authorize, which uses the SOAP api) to do this, not the Query API (what the amazon-ec2 and right-aws gems both use):
~> ec2-authorize backendservers -P tcp -p 8080 -u $AWS_USER_ID -o frontendservers
GROUP backendservers PERMISSION frontendservers ALLOWS tcp 8080 8080 FROM USER (redacted)
Someday this will likely show up in the Query API, but until then, we're stuck coding for SOAP or shelling out to the command line tools to make use of this lovely feature that you'd expect would be standard. This feature has been live for well over a year (I started using it in spring of 2008), so it's really surprising that it's not available in all of the APIs.
Tuesday, September 8, 2009
collectd versus munin
* more efficient than munin's cron/perl-based polling mechanism, with a cleaner C interface versus munin's perl
* nice interfaces for ruby, java, C, erlang http://collectd.org/related.shtml
* lots of options for graphing, data manipulation/utilization
* similar default out-of-the-box plugins for all the major stuff; all the other ones had to be customized for munin anyhow.
Friday, August 21, 2009
Hijack: Provides an irb session to an existing ruby process.
Intro
Hijack allows you to connect to any ruby process and execute code as if it were a normal irb session. Hijack does not require your target process to have required any hijack code beforehand; it is able to connect to any ruby process. It achieves this by using gdb to inject a payload into the process which starts up a DRb server; Hijack then detaches gdb and reconnects via DRb. Please note that gdb will halt your target process while it is attached, though the injection process is very quick and your process should only be halted for a few milliseconds.
Hijack uses DRb over a unix socket file, so you need to be on the same machine as the process you want to hijack. This is by design for security reasons. You also need to run the hijack client as the same user as the remote process.
Wednesday, August 12, 2009
Design Patterns for Social Experiences
http://asis.org/Bulletin/Aug-09/AugSep09_Crumlish.html
The associated wiki seems pretty good, too. It's a set of design patterns for these things:
http://designingsocialinterfaces.com/patterns.wiki/index.php?title=Main_Page
Wednesday, July 1, 2009
"The Flawed Theory Behind Unit Testing"
My point is that we can't look at testing mechanistically. Unit testing does not improve quality just by catching errors at the unit level. And, integration testing does not improve quality just by catching errors at the integration level. The truth is more subtle than that. Quality is a function of thought and reflection - precise thought and reflection. That’s the magic. Techniques which reinforce that discipline invariably increase quality.
http://michaelfeathers.typepad.com/michael_feathers_blog/2008/06/the-flawed-theo.html
Tuesday, June 30, 2009
Wednesday, June 24, 2009
zero-downtime rolling restarts behind a proxy balancer
http://www.igvita.com/2008/12/02/zero-downtime-restarts-with-haproxy/
website speed and effects on users
Monday, June 22, 2009
Another Enterprise Ruby
DESCRIPTION:
Wish you could write your Ruby in XML? Has the fact that Ruby is not "enterprise" got you down? Do you feel like your Ruby code could be made to be more "scalable"? Well look no further my friend. You’ve found the enterprise gem. Once you install this gem, you too can make Rails scale, Ruby faster, your code more attractive, and have more XML in your life.
I’m sure you’re asking yourself, "how can the enterprise gem promise so much?". Well the answer is easy, through the magic of XML! The enterprise gem allows you to write your Ruby code in XML, therefore making your Ruby and Rails code scale. Benefits of writing your code in XML include:
* It’s easy to read!
* It scales!
* Cross platform
* TRANSFORM! your code using XSLT!
* Search your AST with XPath or CSS!
The enterprise gem even comes with a handy "enterprise" binary to help you start converting your existing legacy Ruby code in to scaleable, easy to read XML files. Let’s start getting rid of that nasty Ruby code and replacing it with XML today!
Sunday, June 14, 2009
Friday, June 5, 2009
filed a Rails Enhancement
This is the patch and test case already applied to our local Rails 2.3.2 tree. It ensures we can catch an exception and reload the object when we access attributes that aren't in the cached data.
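The gist of the workaround looks something like this (a minimal sketch, not the actual patch; the helper name is made up):
def read_attribute_with_fallback(record, name)
  record.send(name)
rescue ActiveRecord::MissingAttributeError
  # the attribute wasn't in the cached / partially-selected data;
  # reload the full row and try again
  record.reload
  record.send(name)
end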
Thursday, June 4, 2009
mysql throws "Server shutdown in progress" error when a thread is killed
If you ever see it, know that it might not be the actual server shutting down, but rather just the local client thread to the database being terminated, possibly by a maintenance operation of some sort.
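If you want to treat it as retryable, something along these lines works (a sketch; the retry policy and the query are placeholders):
attempts = 0
begin
  rows = ActiveRecord::Base.connection.select_all("SELECT ...")
rescue ActiveRecord::StatementInvalid => e
  raise unless e.message =~ /Server shutdown in progress/
  attempts += 1
  raise if attempts > 3
  ActiveRecord::Base.connection.reconnect! # get a fresh client thread
  retry
end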
Friday, May 29, 2009
Copying schema to a new, non-test database
require 'stringio'
# We want to copy the schema from class h[:db] to the database entry
# named "#{h[:prefix]}#{RAILS_ENV}"
#
to_copy = [
{ :db => SomeUseDbAbstractClass,
:prefix => "evaldb_",
}
]
to_copy.each { |h|
db = h[:db]
prefix = h[:prefix]
# Place to stick the schema
s = StringIO.new()
# Dump the schema
ActiveRecord::SchemaDumper.dump(db.connection, s)
# Extract the schema
s.rewind
schema = s.string
# Grab the original connection... we'll need this in a bit.
conn_spec_original = UseDbPluginClass.get_use_db_conn_spec({})
# Grab the destination connection information
conn_spec = UseDbPluginClass.get_use_db_conn_spec(:prefix => prefix)
# SQLite3 doesn't like the ENGINEs we emit... scrubadub.
if conn_spec["adapter"] == "sqlite3"
schema.gsub!(/ENGINE=(MyISAM|InnoDB)/, '')
end
# Move the default AR connection over to our destination database
ActiveRecord::Base.establish_connection(conn_spec)
# Play back the "migration" schema dump
eval schema
# Restore connection to its proper home
ActiveRecord::Base.establish_connection(conn_spec_original)
}
Thursday, May 28, 2009
edge rails non-migration support for populating new tables
We could use this to (re)populate our indexes (QuestionIndex, etc.).
a network aggregation utility
Wednesday, May 27, 2009
S3 and retries
Except that now I'm getting occasional 'getaddrinfo: nodename nor servname provided, or not known (SocketError)'.
So, if you're reading in data stored in S3 and want it to be reliable, you should wrap it with a generic exception handler and a lot of backing-off retries. With luck, it'll work out for you. Without luck, well, what sort of reliability do you expect from a "cloud"?
max_attempts = 25
attempts = 0
begin
  my_stuff = s3object.load
rescue Exception => e
  attempts += 1
  raise if attempts >= max_attempts
  sleep attempts # back off a little longer after each failure
  retry
end
Thursday, May 21, 2009
redis twitter-clone example
Thursday, May 14, 2009
really nice activerecord feature i was unaware of
http://api.rubyonrails.org/classes/ActiveRecord/Calculations/ClassMethods.html
we use it in a few places, but i was unaware of it
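For example (the models and columns here are just for illustration):
Person.count(:conditions => ["age > ?", 26])
Person.average(:age)
Order.sum(:total, :conditions => ["paid = ?", true])
Msg.maximum(:created_at)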
--
Love,
Fritz
Monday, May 4, 2009
ruby proxies for testing staging clusters and much more
http://www.igvita.com/2009/04/20/ruby-proxies-for-scale-and-monitoring/
Thursday, April 23, 2009
setting os x terminal window title
Helpful if like me you always have many terminals open.
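In case the link rots: the trick is just the xterm-style title escape sequence, which Terminal.app honors, e.g. from bash:
~> echo -ne "\033]0;my new title\007"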
--
Love,
Fritz
Friday, April 10, 2009
Using ActiveRecord :include to perform join filtering is expensive
self.connections.find(:all, :conditions => ["connections.type IN (?) AND users.active=1", Connection::FRIEND_CONNECTION_TYPES], :order => "users.id ASC", :include => :to).collect(&:to_id).uniq
(a work of art in a single line of Ruby code ;-))
My refactored version using raw SQL looks like this:
quoted_types = Connection::FRIEND_CONNECTION_TYPES.map{|type| "'#{type}'"}
sql = "SELECT DISTINCT(c.to_id) " +
"FROM connections c LEFT JOIN users u ON (c.to_id=u.id) " +
"WHERE c.from_id=#{self.id.to_i} AND c.type IN (#{quoted_types.join(',')}) AND u.active=1 " +
"AND c.to_id NOT IN (SELECT blockee_id FROM blockings WHERE blocker_id=#{self.id.to_i}) AND c.to_id!=#{self.id.to_i}"
Connection.connection.select_values(sql).map(&:to_i)
You'll notice that the SQL version also handles blockings natively now.
Now for the benchmarking using Max's user account with 406 friends:
For the ActiveRecord version:
>> Benchmark.bm(1) { |x| x.report{u.ar_first_degree_friend_ids} }
user system total real
0.400000 0.120000 0.520000 ( 0.615262)
For the SQL version:
>> Benchmark.bm(1) { |x| x.report{max.first_degree_friend_ids} }
user system total real
0.000000 0.000000 0.000000 ( 0.019606)
The performance difference is pretty obvious. This change is not only more correct, but will be 30x faster. We should see a significant performance improvement in cache priming and lazy cache loading as a result.
Thursday, April 9, 2009
Wednesday, April 8, 2009
Log rotation best practices
http://sial.org/howto/logging/rotate/
Their suggested solution is pretty clear: just switch to logging into time-based buckets in the first place, so you don't need to do conveyor-belt rotation and can always find the right logs.
"given the choice of a logrotate scheme, I would first recommend time based buckets, second a startup time scheme, and never hindsight conveyer belt rotation."
Rackspace's approach to logs and analysis over them
See:
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
http://blog.racklabs.com/?p=66
Tuesday, April 7, 2009
wrt magic labels in fixtures
# conversations.yml
woodchuck_capabilities:
id: 1
...
# channels.yml
woodchuck_channel:
conversation: :woodchuck_capabilities # <---- NOPE
conversation_id: 1 # <---- YEP
I knew that the magic couldn't span databases, but didn't know that specifying an id disabled that behavior...
--
Love,
Fritz
Friday, April 3, 2009
useful: schema dump in code
ActiveRecord::SchemaDumper.dump(Operation.connection)
--
Love,
Fritz
Re: FYI: poor mysql query planning when ordering by id
rp
Geesh. That's nice work mysql optimizer!
On Thu, Apr 2, 2009 at 7:48 PM, Fritz Schneider
<fritz@themechanicalzoo.com> wrote:
> [snip]
Thursday, April 2, 2009
FYI: poor mysql query planning when ordering by id
Paying attention to slow queries today Bob, Nick, and I saw a number of seemingly innocuous-looking sql statements that were taking a ridiculously long time to execute. For example:
SELECT * FROM `msgs` WHERE address_id = 11295 ORDER BY id DESC LIMIT 1;
There's an index on address_id so there's no reason this should take up to 8 seconds! Explaining it:
mysql> explain SELECT * FROM `msgs` WHERE address_id = 11295 ORDER BY id DESC LIMIT 1;
+----+-------------+-------+-------+--------------------------------------------------------------------------------------------------+---------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+--------------------------------------------------------------------------------------------------+---------+---------+------+-------+-------------+
| 1 | SIMPLE | msgs | index | index_msgs_on_address_id_and_incoming,index_msgs_on_address_id_and_recognized_cmd_and_created_at | PRIMARY | 4 | NULL | 75525 | Using where |
+----+-------------+-------+-------+--------------------------------------------------------------------------------------------------+---------+---------+------+-------+-------------+
1 row in set (0.00 sec)
It's using the id as the key for this query because of the order by, resulting in a 75k row scan. Adding an additional order by (should be equivalent) solves the problem:
mysql> explain SELECT * FROM `msgs` WHERE (`msgs`.address_id = 11295) ORDER BY id DESC, created_at DESC LIMIT 1;
+----+-------------+-------+------+--------------------------------------------------------------------------------------------------+---------------------------------------+---------+-------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+--------------------------------------------------------------------------------------------------+---------------------------------------+---------+-------+------+-----------------------------+
| 1 | SIMPLE | msgs | ref | index_msgs_on_address_id_and_incoming,index_msgs_on_address_id_and_recognized_cmd_and_created_at | index_msgs_on_address_id_and_incoming | 5 | const | 9 | Using where; Using filesort |
+----+-------------+-------+------+--------------------------------------------------------------------------------------------------+---------------------------------------+---------+-------+------+-----------------------------+
I'm fixing up the two specific places I noticed this (msgs, routing_suggestion_requests), but it's something to be on the lookout for because we order by id all over the place.
I'll be paying much more attention to slow queries now that we have splunk in place.
--
Love,
Fritz
"Migrating from svn to DVCS"
Migrating from svn to a distributed VCS
Wednesday, April 1, 2009
memoize & Aws::S3
--
Love,
Fritz
Tuesday, March 31, 2009
Friday, March 27, 2009
Rails GET and POST params precedence
Wednesday, March 25, 2009
cpu states in top and %st (steal)
Cpu(s): 19.2%us, 1.0%sy, 0.0%ni, 49.7%id, 0.0%wa, 0.0%hi, 0.0%si, 30.1%st
Most of these fields are fairly well documented:
us -> User
sy -> system
ni -> nice
id -> idle
wa -> iowait
hi -> H/w interrupt requests
si -> S/w interrupt requests
The final field, st, is "extra", implemented on virtualized machines.
The best explanation of it I was able to find was here:
http://www.ibm.com/
%steal
Show the percentage of time spent in involuntary wait by the
virtual CPU or CPUs while the hypervisor was servicing another
virtual processor.
AFAICT, this doesn't necessarily mean that "some other user" is stealing cpu resources from you; it could be the servicing of one of your own virtualized resources being consumed, and if that utilization is not tied back to your user correctly, it would show up here.
Tuesday, March 17, 2009
InnoDB can use multiple indices on SELECT
SELECT * FROM `users` WHERE (city_location_id IN (...) OR neighborhood_location_id IN (...))
A WHERE clause like this will cause a full table scan even though there is a separate index on both city_location_id and neighborhood_location_id.
However, if you're using InnoDB tables, MySQL can optimize this by creating an index UNION for you. Read more about it here if interested.
http://dev.mysql.com/doc/refman/5.0/en/index-merge-intersection.html
Password lengths, MSN, and MSN Messenger/Passport
2009-03-17 07:52:13 UTC ERROR [13095/-611309268] aardvarkec2deve: Client manager: XMPP Message Error on myhandle@live.com at msn: () FROM: msn.myserver.com -> TO: myhandle@myserver.com: The password you registered with is incorrect. Please re-register with the correct password.
I'd verified that the password worked by logging in at live.com, so I kept hitting my head against a wall. Finally I tried connecting via an adium client directly to the MSN *messenger* network, which logs in through Microsoft Passport authentication servers. Failure. Finally! Something to work with!
Microsoft appears to enforce two different maximum password lengths. When you create a user, you can apparently type however many characters you want into your password (good). However, Microsoft's Passport service, which is what authentication of IM users goes through, apparently limits the number of characters you can use (I didn't count what the exact limit was). Once I went in and changed our password to be *shorter*, everything worked fine.
So, logging into live.com is not a good test for logging into MSN messenger.
The real source of MSN gateway logout problems
When I looked a few months ago, there wasn't any further info from the Kraken folks (they make the legacy network gateway plugin for openfire). Now though we have this good writeup, confirming that it is indeed:
1) something flakey going on with the avatar protocol with MSN and
2) that the MSN api has changed a lot since they last maintained it.
http://kraken.blathersource.org/node/34
I'm more and more convinced that using Kraken for our IM network connectivity on legacy networks isn't going to keep pace, and we'd be better off switching to libpurple.
Monday, March 16, 2009
rake invoke versus execute: a huge difference
First, here's a background example explaining how dependencies aren't run when using execute, but are always run when using invoke.
http://chrisroos.co.uk/blog/2007-12-06-ruby-rake-invoke-vs-execute
I additionally discovered that a task called using invoke will only run the *first* time you invoke it, not a second, third, or fourth time.
In my particular case, I wrote rake tasks to stop and start mysql on mac osx, and then used them within a number of other tasks. Each task ran fine when I tested it individually, but when I tried to chain the series of tasks together, everything went horribly awry, and it looked as if mysql wasn't properly restarting. I went digging for why, and discovered that because I was using invoke on my tasks, the mysql stop and start tasks only ran when called from the first of the chained tasks. Switching to execute resolved the problem.
Here's the example that demonstrates this; note that the code for both the invoke and execute tests tries to run the stop/start sequence twice (stop/start/stop/start). In practice though, the first test with invoke only runs the stop/start sequence once, while the second with execute runs it twice.
namespace :db do
  desc 'stop mysql on macosx, requires sudo'
  task :stop_mysql_macosx do
    puts "Attempting to stop mysql..."
    `sudo /Library/StartupItems/MySQLCOM/MySQLCOM stop`
    sleep 2
    mysql_process = `ps aux | grep mysql | egrep -v "grep|rake"`
    raise "mysql did not stop properly, got mysql_process #{mysql_process}" unless mysql_process == ""
    puts "Successfully stopped mysql."
  end
  desc 'start mysql on macosx, requires sudo'
  task :start_mysql_macosx do
    puts "Attempting to start mysql..."
    `sudo /Library/StartupItems/MySQLCOM/MySQLCOM start`
    sleep 2
    mysql_process = `ps aux | grep mysql | egrep -v "grep|rake"`
    raise "mysql did not start properly" if mysql_process == ""
    puts "Successfully started mysql."
  end
  desc 'test start/stop mysql macosx using invoke'
  task :test_start_stop_mysql_macosx_using_invoke do
    Rake::Task['db:stop_mysql_macosx'].invoke
    Rake::Task['db:start_mysql_macosx'].invoke
    Rake::Task['db:stop_mysql_macosx'].invoke
    Rake::Task['db:start_mysql_macosx'].invoke
  end
  desc 'test start/stop mysql macosx using execute'
  task :test_start_stop_mysql_macosx_using_execute do
    Rake::Task['db:stop_mysql_macosx'].execute
    Rake::Task['db:start_mysql_macosx'].execute
    Rake::Task['db:stop_mysql_macosx'].execute
    Rake::Task['db:start_mysql_macosx'].execute
  end
end
~/tmz/aardvark/trunk/dragonfly > sudo rake db:test_start_stop_mysql_macosx_using_invoke
(in /Users/nathan/tmz/aardvark/trunk/dragonfly)
Attempting to stop mysql...
Successfully stopped mysql.
Attempting to start mysql...
Successfully started mysql.
~/tmz/aardvark/trunk/dragonfly > sudo rake db:test_start_stop_mysql_macosx_using_execute
(in /Users/nathan/tmz/aardvark/trunk/dragonfly)
Attempting to stop mysql...
Successfully stopped mysql.
Attempting to start mysql...
Successfully started mysql.
Attempting to stop mysql...
Successfully stopped mysql.
Attempting to start mysql...
Successfully started mysql.
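One more wrinkle worth knowing about: if your rake version has Rake::Task#reenable (present in rake 0.8-era releases, as far as I know), you can reset a task's "already invoked" flag and keep using invoke, so that dependencies still run:
Rake::Task['db:stop_mysql_macosx'].invoke
Rake::Task['db:stop_mysql_macosx'].reenable # clears the already-invoked flag
Rake::Task['db:stop_mysql_macosx'].invoke   # runs again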
Tuesday, March 10, 2009
i uploaded my xmpp primer to our google docs
originally, but it's a great getting started guide.
--
Love,
Fritz
Friday, March 6, 2009
tokyocabinet schemaless database toolset/api
http://tokyocabinet.sourceforge.net/index.html
Friday, February 27, 2009
schema-less table using MySQL
A nice solution to FriendFeed's scaling problems.
Basically, they're doing a bigtable variant using mysql. The consistency check discussion at the end of the article also seems reasonable.
Friday, February 20, 2009
good to know
--
Love,
Fritz
Thursday, February 19, 2009
dynamically selecting constants from a class
VALID_ADDED_CONTEXTS = consts_starting_with('ADDED_CONTEXT_')
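consts_starting_with isn't standard Ruby; presumably it's a small class-level helper along these lines (a sketch of one possible implementation):
def self.consts_starting_with(prefix)
  # collect every constant on this class whose name begins with prefix
  constants.map { |c| c.to_s }.grep(/^#{Regexp.escape(prefix)}/).map { |name| const_get(name) }
end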
--
Love,
Fritz
Wednesday, February 18, 2009
Get the date in your OS X menubar
Getting this is kind of tricky: you go to System Preferences > International > Formats. Click "Customize" in the Dates area, select any date-related fields you want, and copy them to your system clipboard. Then, go back and click "Customize" in the Times area, and choose "Medium". Then, paste in the fields and massage formatting to your heart's content. Et voila, you can have the date fields available in your menubar, beyond just the day of the week.
i did not know you can do this
ActiveRecord::RecordInvalid: Validation failed: Added context is not included in the list
However! You can use {{value}} when specifying a message, for example:
validates_inclusion_of :added_context, :in => VALID_ADDED_CONTEXTS,
:message => "Value {{value}} not listed as a valid context in Interest::VALID_ADDED_CONTEXTS"
And now if I pass in an illegal value "your mom" we get something much more useful:
Added context Value your mom not listed as a valid context in Interest::VALID_ADDED_CONTEXTS
--
Love,
Fritz
Tuesday, February 17, 2009
Friday, February 13, 2009
Wednesday, February 11, 2009
ah hah! why IDEA sometimes doesn't let you run individual tests
In some test files, IDEA will let you right-click a test method in order to run tests individually, whereas in other test files it will only let you run the whole thing. It's really annoying. Well, I finally just noticed why: if not all your methods start with test_ (or setup and teardown, I suppose) then it won't let you run individual tests! I'm not sure why, but I'd like to find out, and it's good to know. If you have helpers in the test, IDEA won't find the individual tests!
--
Love,
Fritz
creating a password protected zip archive on mac osx
# first install a newer zip via macports (you can also use fink)
~> sudo mv /usr/bin/zip /usr/bin/zip-old
~> sudo ln -s /opt/local/bin/zip /usr/bin/
~> zip -e new_zipfile_name.zip /path/to/file1 /path/to/file2 ...
# enter password when prompted...
# note that this will open perfectly normally under windows
# but under mac osx you need to use either unzip via a terminal
# or download and use stuffit expander to open the file
# mac osx's bomarchivehelper does not handle password zip files
hotmail adds pop, integrates msn chat into the inbox
the pop3 support will let us eventually fix up our various email forwarding nonsense (currently we forward all email to our Google apps aardvark@vark.com account, and then pop it out; this adds email latency).
the integrated chat obviously is helpful for our product.
Tuesday, February 10, 2009
confluence IM presence plugin
Continuous Deployment at IMVU
A great case study for continuous deployment. We have a bit of work to do before we reach this point, but we have the right foundation.
Friday, February 6, 2009
Will it parse? i =+ 1
The answer is, of course, yes! Ruby just assigns +1 to i. I discovered this when my confusion matrix came out as all zeros and ones...
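In other words:
i = 0
i =+ 1 # parses as i = (+1), not i += 1
i      # => 1, no matter what i started as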
Thursday, February 5, 2009
an example of what NOT to do from the ruby standard library
class ConditionVariable
class Timeout < Exception; end
# Create a new timer with the argument timeout, and add the
# current thread to the list of waiters. Then the thread is
# stopped. It will be resumed when a corresponding #signal
# occurs.
def wait(timeout = nil)
@monitor.instance_eval {mon_check_owner()}
timer = create_timer(timeout)
Thread.critical = true
count = @monitor.instance_eval {mon_exit_for_cond()}
@waiters.push(Thread.current)
begin
Thread.stop
return true
rescue Timeout
return false
ensure
Thread.critical = true
begin
...
--
Love,
Fritz
Wednesday, February 4, 2009
Tuesday, February 3, 2009
openfire has a simple presence plugin
This would allow some simple dashboard status reporting of aardvark's legacy handle status as reported by openfire, w/o requiring opening up or logging into the admin console.
Probably not as useful as just getting heartbeat monitoring working on our legacy network handles, but the presence page in openfire's admin console has been a useful debugging tool in the past for times when openfire is mis-reporting the status of a handle.
Sunday, February 1, 2009
use_db and Transactional Fixtures
First off, I should note, use_db "is a piece of crap. Don't use it"...this according to its project page on github: http://github.com/jcnetdev/use_db/tree/master.
In the databuilder project, we use the ML db as the primary and configured use_db to use the aardvark database for shared models. While fixtures for the aardvark database located in test/fixtures seem to work, transactional fixtures do not. We poked around in the use_db library and couldn't find where it goes wrong. After 20 mins, we gave up.
So AFAWK, use_db doesn't adequately support transactional fixtures on the secondary database.
Friday, January 30, 2009
nice module pattern
module VarkLogging
# any constants you want to define
VARK_LOG_MESSAGE = "vark log!"
def self.included(base)
base.send(:include, InstanceMethods)
base.send(:extend, ClassMethods)
base.class_eval do
# any class methods we need to call
alias_method_chain :add, :vark_logging
end
end
module InstanceMethods
def add_with_vark_logging(*args)
puts VARK_LOG_MESSAGE
add_without_vark_logging(*args)
end
end
module ClassMethods
def vark?
true
end
end
end
# and if you want to include it without altering the original class definition:
Logger.send(:include, VarkLogging)
Mounting extra disks on ec2 instances
~> cat /proc/partitions
major minor #blocks name
8 16 440366080 sdb
8 32 440366080 sdc
8 48 440366080 sdd
8 64 440366080 sde
8 1 10485760 sda1
Here we note that /dev/sda1 and /dev/sdb are in use, while /dev/sdc, /dev/sdd, and /dev/sde are not yet in use.
~> df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 9.9G 1.6G 7.9G 17% /
/dev/sdb 414G 199M 393G 1% /mnt
none 3.6G 0 3.6G 0% /dev/shm
We want to check the fstab to see what the disk types are. Also edit this file if you want your mounted disk to come back up on a reboot! (In EC2-land all of these filesystems are non-persistent, so if your machine does anything more than reboot, you'll lose the data entirely.)
~> more /etc/fstab
/dev/sda1 / ext3 defaults 1 1
/dev/sdb /mnt ext3 defaults 0 0
none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0
none /proc proc defaults 0 0
none /sys sysfs defaults 0 0
Create your mount point:
~> mkdir /mnt2
Mount the disk:
~> mount -t ext3 /dev/sdc /mnt2
And check its availability:
~> df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 9.9G 1.6G 7.9G 17% /
/dev/sdb 414G 199M 393G 1% /mnt
none 3.6G 0 3.6G 0% /dev/shm
/dev/sdc 414G 199M 393G 1% /mnt2
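And the /etc/fstab line to add if you want /mnt2 to come back on a plain reboot (same caveat as above: the data won't survive anything more than a reboot):
/dev/sdc /mnt2 ext3 defaults 0 0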
Dealing with Capistrano's cached-copy when switching the branch you're releasing from
e.g., when capistrano executes
~> cap deploy:update
it runs something like the following:
executing "if [ -d /deploy/myproject/shared/cached-copy ]; then svn update -q -r3131 /deploy/myproject/shared/cached-copy; else svn checkout -q -r3131 https://svn.example.com/svn/myreponame/tags/release-1.2.1/myproject /deploy/myproject/shared/cached-copy; fi"
This can cause some problems when you try to switch the repository branch you're deploying from, say, to roll back to a tag, or to deploy from a radically different stable branch where svn up won't run properly because of deleted directories etc.
So, before you release from a tag, you have to remember to delete the cached copy checked out on the servers. Capistrano's invoke comes in handy for these situations:
~> cap invoke COMMAND="rm -rf /deploy/myproject/shared/cached-copy"
You might also want to remember to delete the cached-copy *after* you deploy, in case the next person who comes along doesn't realize that you deployed from a strange location.
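You could also wrap that rm in a small task of your own so it's harder to forget (a sketch; shared_path is capistrano's standard variable):
desc "Blow away the svn cached-copy on all servers; run when switching branches/tags"
task :clear_cached_copy do
  run "rm -rf #{shared_path}/cached-copy"
end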
i can't believe it's this hard to set default values at the AR object level (not the db level)
before_save :maybe_set_source
def maybe_set_source
self[:source] = source_name if !self[:source]
end
def source
self[:source] || source_name
end
# The default value I want to set
def source_name
File.basename(RAILS_ROOT)
end
Really?
Thursday, January 29, 2009
assert_yields
def assert_yields(something_callable, arg)
yielded = false
something_callable.call(arg) do
yielded = true
end
assert yielded
end
# Called like:
assert_yields(some_obj.method(:do_something), my_arg)
One could easily make it handle a variable number of arguments.
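For example, the variadic version is just:
def assert_yields(something_callable, *args)
  yielded = false
  something_callable.call(*args) { yielded = true }
  assert yielded
end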
Object is not missing constant Question!
You referenced a class that wasn't defined, so ActiveSupport's dependency resolver kicked in.
It loaded the appropriate file (in this case question.rb), but during the load an Exception was raised. The error was not, however, in your class definition, so the missing constant (Question) was successfully defined.
The resolver then catches that exception, expects that your class will not be defined, but finds that it is, and throws "Object is not missing constant Question!" because it's confused.
tell it, joel
position (and, by implication, what to look for):
http://www.joelonsoftware.com/items/2009/01/02b.html
--
Love,
Fritz
Wednesday, January 28, 2009
overriding string comparisons
because Question was also overriding String#==.
--
Love,
Fritz
this should come as no surprise
http://blog.dalethatcher.com/2008/03/rails-dont-override-initialize-on.html
In fact, don't override anything in AR :)
--
Love,
Fritz
Friday, January 23, 2009
chill that Dock out!
Stop your Dock icons from their vatter-effing bouncing when they want your attention:
http://www.macworld.com/article/138403/2009/01/dockbounce.html?lsrc=rss_weblogs_macosxhints
--
Love,
Fritz
Thursday, January 22, 2009
great splunk howtos
http://www.igvita.com/2008/06/19/splunk-your-distributed-logs-in-ec2/
using syslog-ng:
http://www.igvita.com/2008/10/22/distributed-logging-syslog-ng-splunk/
AR constructs chained named scope queries in the order opposite what you'd expect
Conversation.in_the_last(1.day).
with_question.
not_error.
not_canceled.
not_flagged.
not_answered.
not_involving(user)
You'd expect the created_at restriction to come first in the resulting SELECT, wouldn't you? Nope:
SELECT * FROM `conversations` WHERE (((((((NOT EXISTS (SELECT chs3.id FROM channels chs3 WHERE conversations.id = chs3.conversation_id AND chs3.user_id = 176)) AND (NOT EXISTS (SELECT chs2.id FROM channels chs2 WHERE conversations.id = chs2.conversation_id AND chs2.answer_request_response = 'answer'))) AND (NOT EXISTS (SELECT chs1.id FROM channels chs1 WHERE conversations.id = chs1.conversation_id AND chs1.flagged = 1))) AND (`conversations`.`canceled` = 0 )) AND (`conversations`.`error` = 0 )) AND (EXISTS (SELECT chs4.id FROM channels chs4 WHERE conversations.id = chs4.conversation_id AND chs4.asker = 1 AND chs4.has_question = 1))) AND (created_at > '2009-01-21 16:38:23'));
AR sticks it at the end of the query and as a result it runs in 6-10 seconds on acceptance (w/no query cache). Applying the created_at restriction at the end of the chain constructs a query with it at the beginning:
Conversation.with_question.
not_error.
not_canceled.
not_flagged.
not_answered.
not_involving(user).
in_the_last(1.day)
SELECT * FROM `conversations` WHERE (((((((created_at > '2009-01-21 18:07:29') AND (NOT EXISTS (SELECT chs3.id FROM channels chs3 WHERE conversations.id = chs3.conversation_id AND chs3.user_id = 176))) AND (NOT EXISTS (SELECT chs2.id FROM channels chs2 WHERE conversations.id = chs2.conversation_id AND chs2.answer_request_response = 'answer'))) AND (NOT EXISTS (SELECT chs1.id FROM channels chs1 WHERE conversations.id = chs1.conversation_id AND chs1.flagged = 1))) AND (`conversations`.`canceled` = 0 )) AND (`conversations`.`error` = 0 )) AND (EXISTS (SELECT chs4.id FROM channels chs4 WHERE conversations.id = chs4.conversation_id AND chs4.asker = 1 AND chs4.has_question = 1)))
Apparently it can now use the created_at to eliminate rows in the scan. The resulting query time is about 300ms on acceptance (again, w/no query cache).
Wednesday, January 21, 2009
Percona: How to Outrun The Lions
Great presentation on MySQL scaling.
http://www.scribd.com/doc/3929163/Percona-How-to-Outrun-The-Lions
some really nice things about working at a startup...
http://blog.jayfields.com/2009/01/questions-to-ask-interviewer.html
--
Love,
Fritz
Tuesday, January 20, 2009
2 gems from SmugMug
http://blogs.smugmug.com/don/2008/06/03/skynet-lives-aka-ec2-smugmug/
Also here are some pointers from them on mysql; the recommendation on Percona is something we should keep in mind should we need to call in some db expertise.
http://blogs.smugmug.com/don/2008/12/23/great-things-afoot-in-the-mysql-community/
--
Love,
Fritz
Monday, January 19, 2009
Anti-RDBMS: A list of distributed key-value stores
Friday, January 16, 2009
Craigslist uses Sphinx
Well, I guess the cat's out of the bag! My first project at Craigslist was replacing MySQL FULLTEXT indexing with Sphinx. It wasn't the easiest road in the world, for a variety of reasons, but we got it all working and it's been humming along very well ever since.
I'm not going to go into a lot of details on the implementation here, other than to say Sphinx is faster and far more resource efficient than MySQL was for this task.
http://tinyurl.com/7gkwbz
Thursday, January 15, 2009
new in rails 2.2
I definitely don't understand all the implications of threading or
connection pools. The automatic memoization is cool though.
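For reference, the memoization bit looks like this (Product and the method are hypothetical):
class Product < ActiveRecord::Base
  extend ActiveSupport::Memoizable

  def expensive_description
    # ...some slow computation; runs only once per instance...
  end
  memoize :expensive_description
end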
--
Love,
Fritz
Building products users love by giving them control
Finding unused indexes in mysql
http://hackmysql.com/mysqlidxchk
--
Love,
Fritz