Monday 29 April 2013

Chef Experiments - Create Users



The objective here is to create a users cookbook with data bags.

Create the data bag

knife data bag create user_config

Create the user json file

data_bags/users/usr_sri.json

{
    "id": "sri",
    "comment": "Sriram Rajan",
    "uid": 2000,
    "gid": 0,
    "home": "/home/sri",
    "shell": "/bin/bash",
    "pubkey": "<replace with the SSH public key>"
}


Import the file

knife data bag from file user_config data_bags/users/usr_sri.json
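
To confirm the import, show the stored item (the item id inside the file is sri):

knife data bag show user_config sri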


Create a key for the encrypted data bag

openssl rand -base64 512 > data_bags/users/enckey


Create the encrypted data bag

knife data bag create --secret-file data_bags/users/enckey password_config pwdlist



Edit the data bag

knife data bag edit --secret-file data_bags/users/enckey password_config pwdlist


"id": "pwdlist",
{
"sri": "Replace with SHA password string"
}
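
One way to produce the SHA password string is Ruby's String#crypt (a sketch; it assumes a Linux host whose crypt(3) supports $6$ SHA-512 salts). You can then confirm the item decrypts with the key:

# Generate a SHA-512 shadow hash for the password entry
ruby -e 'puts "plaintext".crypt("$6$" + rand(36**8).to_s(36))'

# Confirm the encrypted item decrypts with the key
knife data bag show password_config pwdlist --secret-file data_bags/users/enckey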


At this point you should have a data bag with users and an encrypted data bag with passwords. Now we move to the cookbook.

Create the cookbook

knife cookbook create user_config

The recipe looks like this. We add each user, ensure the .ssh directory is created and populated with the public key, and pull the password from the encrypted data bag.


File : cookbooks/user_config/recipes/default.rb

decrypted = Chef::EncryptedDataBagItem.load("password_config", "pwdlist")

search(:user_config, "*:*").each do |user_data|
    user user_data['id'] do
        comment user_data['comment']
        uid user_data['uid']
        gid user_data['gid']
        home user_data['home']
        shell user_data['shell']
        manage_home true
        password decrypted[user_data['id']]
        action :create
    end

    ssh_dir = user_data['home'] + "/.ssh"

    directory ssh_dir do
        owner user_data['uid']
        group user_data['gid']
        mode "0700"
    end

    template "#{ssh_dir}/authorized_keys" do
        owner user_data['uid']
        group user_data['gid']
        mode "0600"
        variables(
            :ssh_keys => user_data['pubkey']
        )
        source "authorized_keys.erb"
    end
end

The template file

cookbooks/user_config/templates/default/authorized_keys.erb

<% Array(@ssh_keys).each do |key| %>
<%= key %>
<% end %>



Finishing up
knife cookbook upload user_config

Ensure the secret key for the encrypted data bag is also sent to the node and stored under /etc/chef/encrypted_data_bag_secret. You can bootstrap this file into the node build. See http://docs.opscode.com/essentials_data_bags_encrypt.html
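
For an existing node, a minimal way to push the key is plain scp (the hostname here is hypothetical):

scp data_bags/users/enckey root@node1.example.com:/etc/chef/encrypted_data_bag_secret
ssh root@node1.example.com 'chmod 600 /etc/chef/encrypted_data_bag_secret'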

Then add the recipe to a role or node run list and run the chef-client to test.
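
For example (the node name is hypothetical):

knife node run_list add node1.example.com 'recipe[user_config]'
ssh root@node1.example.com chef-client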

Designing in the cloud


Service based model
This is not a very new concept (http://en.wikipedia.org/wiki/Service-oriented_architecture), but the cloud model makes it very important. Build your business model so that it can be consumed as a service. This also forces you to modularize the parts, all of which gives you a high degree of portability.


Build for failure
Cloud is multi-tenant in most cases, and with that come challenges like noisy neighbours and failures of individual components. Build for these scenarios. Netflix's Simian Army (http://techblog.netflix.com/2011/07/netflix-simian-army.html) talks a lot about this and is an interesting read. Importantly, plan for "what happens when".

In building for failure you are also creating a good recovery model. One of the benefits of running everything as code is that you can recover faster, which translates into better uptime.

Cloud is all about the API and pluggability. Think about building a top-level API for your business model, then use vendor APIs and plug them into yours. Wherever possible, loosely couple your application interactions; for example, instead of direct database calls, use an API.
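
As a rough sketch of that idea (the class and method names are hypothetical, not from any vendor library): callers depend on a small service interface, so the backing store can be swapped without touching them.

# Hypothetical sketch of loose coupling: callers go through
# UserService instead of issuing database calls directly.
class UserService
  def initialize(backend)
    @backend = backend              # an HTTP API client, a DB adapter, etc.
  end

  def find_user(id)
    @backend.fetch("users/#{id}")   # swap the backend; callers never change
  end
end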


Monitoring
Monitoring becomes more than just making sure your applications are working. If you leverage multiple cloud providers, you can use monitoring data to make operational decisions: to pick the best-value provider and save costs, or to find low-performing instances within a single provider. One important point here is to make sure your monitoring is vendor agnostic and, wherever possible, not a tool provided by the vendor. Frameworks like Sensu (http://www.sonian.com/cloud-monitoring-sensu/) or tools like Riemann (http://riemann.io) can help.


Automation
Cloud will force automation to a large extent, and you need to embrace it. Automation also allows you to build across different vendors. When using multiple vendors, use a model that works on all platforms; open-source libraries like libcloud provide vendor-agnostic ways to do this. Be careful with automation providers, as you can end up with vendor lock-in in a different form. And while building your own autoscale model is complex, in the long term there is a lot more to gain, as it will fit your business model.


Think about data
Cloud provides commodity services for things like compute and storage, but your data is not a commodity. So think about distributing it over different vendors, or build that into your recovery model.


Think about security
Security in the cloud is a hot topic, and it is safe to say that it is still evolving. It is also easy to overlook while you plug in the other nuts and bolts. Make sure things like identity management and access control models are at the heart of your cloud strategy. Even if security is not an immediate requirement, you can build these as services to be implemented at a later stage.


Wednesday 13 March 2013

Chef Experiments - Create SSH config


The objective here is to create a simple sshd cookbook for Red Hat/CentOS configuration.

Create the cookbook

knife cookbook create sshd


Create the default recipe.
Options like the sshd port and banner will be pulled from a data bag called base_config. The template file for the sshd configuration is sshd.erb. The sshd_config template notifies the sshd service, so sshd is only restarted when the rendered file actually changes.


File : cookbooks/sshd/recipes/default.rb

sshd_config = data_bag_item('base_config', 'sshd')

template "/etc/ssh/sshd_config" do
    source "sshd.erb"
    mode "0644"
    variables(
        :sshd_port => sshd_config['port'],
        :x11_forwarding => sshd_config['x11_forwarding'],
        :banner => sshd_config['banner'],
        :permit_root => sshd_config['permit_root']
    )
    notifies :restart, "service[sshd]"
end

template "/etc/issue.net" do
    source "issue.net.erb"
end

service "sshd" do
    supports :restart => true
    action [ :enable, :start ]
end
The template sshd.erb looks like this
File : cookbooks/sshd/templates/default/sshd.erb
Port <%= @sshd_port %>
Protocol 2

#other SSHD config has been omitted for the sake of the blog post

PermitRootLogin <%= @permit_root %>
X11Forwarding <%= @x11_forwarding %>
Banner <%= @banner %>

#other SSHD config has been omitted for the sake of the blog post

In issue.net.erb we are reading from /etc/motd and adding a little blurb after that
File : cookbooks/sshd/templates/default/issue.net.erb
<%= File.read("/etc/motd") %>

*****************************
Use of the Site by unauthorized users is prohibited and 
unauthorized users will be prosecuted to the fullest 
extent of the law.
*****************************
Data bag
Create the data bag item file
File : data_bags/base_config/config.json
{  
  "id": "sshd",  
  "port": "2222",  
  "x11_forwarding": "no",  
  "banner":"/etc/issue.net",  
  "permit_root": "no"  
}  
Finishing up
knife data bag create base_config
knife data bag from file base_config data_bags/base_config/config.json 
knife cookbook upload sshd

Then add the recipe to a role or node run list and run the chef-client
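
After the run, one quick check on the node that the rendered config picked up the data bag values:

grep -E '^(Port|PermitRootLogin|X11Forwarding|Banner)' /etc/ssh/sshd_config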

Wednesday 6 March 2013

Chef Experiments - Create host files

I am in the process of experimenting with Chef and here's one of them

knife cookbook create host_file_update


Then create the recipe

recipes/default.rb

hosts = search(:node, "*:*")
template "/etc/hosts" do
  source "hosts.erb"
  owner "root"
  group "root"
  mode 0644
  variables(
    :hosts => hosts,
    :hostname => node[:hostname],
    :fqdn => node[:fqdn]
  )
end
And then the template file hosts.erb referenced above

templates/default/hosts.erb 
127.0.0.1   localhost

<% @hosts.each do |node| %>
<%= node['ipaddress'] %> <%= node['hostname'] %> <%= node['fqdn'] %>
<% end %>

Pretty useful if you want to populate this automatically as and when you add servers. One of the next things to try is to see if we can make Chef pick up the additional IPs (e.g. the service net in the Rackspace cloud) and create separate entries for them; a sketch follows.
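
A rough sketch of that idea as it might look in hosts.erb, assuming Ohai's rackspace plugin exposes the service-net address as node['rackspace']['private_ip'] (the attribute name is an assumption; check the ohai output on a node first):

<% @hosts.each do |node| %>
<% if node['rackspace'] && node['rackspace']['private_ip'] %>
<%= node['rackspace']['private_ip'] %> <%= node['hostname'] %>-snet
<% end %>
<% end %>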



Sunday 3 March 2013

MySQL Information Schema

Re-publishing from my wiki.
  • Show tables that use Barracuda disk format
select * from INFORMATION_SCHEMA.TABLES where TABLE_SCHEMA NOT IN ('mysql', 'INFORMATION_SCHEMA', 'performance_schema') AND ( ROW_FORMAT='Compressed' OR ROW_FORMAT='Dynamic'); 

  • Show me all tables that are InnoDB 
SELECT `table_schema`, `table_name` FROM `information_schema`.`TABLES` WHERE `Engine`='Innodb' AND `TABLE_SCHEMA` !='information_schema' AND `TABLE_SCHEMA` !='mysql'; 

  • Show me all tables that are MyISAM 
SELECT `table_schema`, `table_name` FROM `information_schema`.`TABLES` WHERE `Engine`='MyISAM' AND `TABLE_SCHEMA` !='information_schema' AND `TABLE_SCHEMA` !='mysql'; 

  • Print Queries to aid in conversion FROM MyISAM to InnoDB 
 use `information_schema`; SELECT CONCAT("ALTER TABLE `" , `TABLE_SCHEMA`, "`.`", `table_name`, "` Engine=Innodb;") AS "" FROM `information_schema`.`TABLES` WHERE `Engine`='MyISAM' AND `TABLE_SCHEMA` !='information_schema' AND `TABLE_SCHEMA` !='mysql'; 

You can save the above in a file and run this
mysql --batch < input.sql > out.sql 

  • Show me a count of tables grouped by engine type 
SELECT `Engine`, count(*) as Total FROM `information_schema`.`TABLES` WHERE `TABLE_SCHEMA` !='information_schema' AND `TABLE_SCHEMA` !='mysql' GROUP BY `Engine`;

  • Show me the datasize and index size of all tables grouped by engine type
SELECT `Engine`, COUNT(ENGINE), sum(data_length)/(1024*1024*1024) as 'Datasize-GB', sum(index_length)/(1024*1024*1024) as 'Indexsize-GB' FROM `information_schema`.`TABLES` GROUP BY `Engine`; 

  • Show me the top 10 tables by size outside of information_schema and mysql 
SELECT TABLE_SCHEMA, TABLE_NAME, data_length/(1024*1024) AS 'Datasize-MB' FROM `information_schema`.`TABLES` WHERE `TABLE_SCHEMA` !='information_schema' AND `TABLE_SCHEMA` !='mysql' ORDER BY `data_length` DESC LIMIT 10; 

  • Tables without indexes
USE `information_schema`; SELECT CONCAT(`TABLES`.`table_schema`, ".", `TABLES`.`table_name`) as name, `TABLES`.`TABLE_TYPE`, `TABLE_ROWS` FROM `TABLES` LEFT JOIN `TABLE_CONSTRAINTS` ON `TABLES`.`table_schema` = `TABLE_CONSTRAINTS`.`table_schema` AND `TABLES`.`table_name` = `TABLE_CONSTRAINTS`.`table_name` AND `TABLE_CONSTRAINTS`.`constraint_type` = 'PRIMARY KEY' WHERE `TABLE_CONSTRAINTS`.`constraint_name` IS NULL;

  • Check for redundant indexes
SELECT * FROM ( SELECT `TABLE_SCHEMA`, `TABLE_NAME`, `INDEX_NAME`, GROUP_CONCAT(`COLUMN_NAME` ORDER BY `SEQ_IN_INDEX`) AS columns FROM `information_schema`.`STATISTICS` WHERE `TABLE_SCHEMA` NOT IN ('mysql', 'INFORMATION_SCHEMA') AND NON_UNIQUE = 1 AND INDEX_TYPE='BTREE' GROUP BY `TABLE_SCHEMA`, `TABLE_NAME`, `INDEX_NAME` ) AS i1 INNER JOIN ( SELECT `TABLE_SCHEMA`, `TABLE_NAME`, `INDEX_NAME`, GROUP_CONCAT(`COLUMN_NAME` ORDER BY `SEQ_IN_INDEX`) AS columns FROM `information_schema`.`STATISTICS` WHERE INDEX_TYPE='BTREE' GROUP BY `TABLE_SCHEMA`, `TABLE_NAME`, `INDEX_NAME` ) AS i2 USING (`TABLE_SCHEMA`, `TABLE_NAME`) WHERE i1.columns != i2.columns AND LOCATE(CONCAT(i1.columns, ','), i2.columns) = 1;

  • List character sets 
 SELECT `TABLE_SCHEMA`, `TABLE_NAME`, `CHARACTER_SET_NAME`, `TABLE_COLLATION` FROM `INFORMATION_SCHEMA`.`TABLES` INNER JOIN `INFORMATION_SCHEMA`.`COLLATION_CHARACTER_SET_APPLICABILITY` ON (`TABLES`.`TABLE_COLLATION` = `COLLATION_CHARACTER_SET_APPLICABILITY`.`COLLATION_NAME`) WHERE `TABLES`.`TABLE_SCHEMA` !='information_schema' AND `TABLES`.`TABLE_SCHEMA` !='mysql' ;

  • List average row length and index length 
SELECT CONCAT (`TABLE_SCHEMA`, "." , `TABLE_NAME`) as name , `AVG_ROW_LENGTH`, `DATA_LENGTH`, `INDEX_LENGTH` FROM `TABLES` ORDER BY `AVG_ROW_LENGTH` DESC LIMIT 15; 

  • Oldest tables with respect to update times 
SELECT CONCAT (`TABLE_SCHEMA`, "." , `TABLE_NAME`) as name , `UPDATE_TIME` FROM `TABLES` WHERE `UPDATE_TIME` IS NOT NULL ORDER BY `UPDATE_TIME` LIMIT 10; 

  • Tables with foreign keys 
SELECT * FROM `table_constraints` WHERE `constraint_type` = 'FOREIGN KEY' ; 

  • List of indexes and their total count 
SELECT `INDEX_TYPE`, count(*) as NUM FROM `INFORMATION_SCHEMA`.`STATISTICS` GROUP BY `INDEX_TYPE`; 

  • Get a summary of privileges 
SELECT * from `INFORMATION_SCHEMA`.`USER_PRIVILEGES`;

Wednesday 20 February 2013

Rackspace cloud files - symlinks / aliases



The requirement is to have multiple names for the same object. For starters, there is no built-in way to do aliases or multiple names for the same object. However, after some documentation trawling, there is a way to achieve it, although it is not straightforward.

Cloud Files offers large-file (over 5 GB) support by letting you upload multiple segments and then a manifest that links the segments.

http://www.rackspace.com/blog/rackspace-cloud-files-now-supporting-extremely-large-file-sizes/

You can use this feature to sort of achieve symlinking/aliasing.

Here's a small example

curl -D - \
     -H "X-Auth-Key: AUTH key" \
     -H "X-Auth-User: user" \
     https://lon.identity.api.rackspacecloud.com/v1.0



curl -X PUT -H 'X-Auth-Token: AUTH TOKEN' \
     https://storage101.lon3.clouddrive.com/v1/<URL>/stream/imagedata1/1 --data-binary 'Image1'



curl -X PUT -H 'X-Auth-Token: AUTH TOKEN' \
     -H 'X-Object-Manifest: stream/imagedata1/' \
     https://storage101.lon3.clouddrive.com/v1/<URL>/stream/image1.txt --data-binary ''



curl -X PUT -H 'X-Auth-Token: AUTH TOKEN' \
     -H 'X-Object-Manifest: stream/imagedata1/' \
     https://storage101.lon3.clouddrive.com/v1/<URL>/stream/image2.txt --data-binary ''



This works on CDN-enabled containers as well, so once you do this, the following will technically point to the same object:
<CDN URL>/image1.txt

<CDN URL>/image2.txt