Archive for the 'Theoretical Hacks' Category

Splitting bioinformatics FASTA files

I keep forgetting where my scripts are in my home directories. Below is my Ruby script to split a large FASTA [1] file into files of N sequences each:

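#!/usr/bin/env ruby
#
# Script: dumpseq.rb
# Description: Parses a BLAST FASTA file and dumps every N sequences to a
#              separate file.
# Usage: dumpseq.rb [fasta_file] [seqs_per_file]

require 'fileutils'

fasta_db = File.new(ARGV[0])
# N seqs per output file. Reading it from the second argument, with a
# default of 1, is an assumption: the script never defined N.
N = (ARGV[1] || 1).to_i

sno = 0 # sequences read so far
d = 0   # output files created so far

file = nil

until fasta_db.eof?
  # read up to and including the ">" that opens the next record, then drop it
  x = fasta_db.readline("\n>").sub(/>\z/, "")
  if sno % N == 0 # N seqs per file
    file.close unless file.nil?
    dir = sprintf("D%04d000", d / 1000)
    FileUtils.mkdir_p dir
    # short filenames
    fname = sprintf "SEQ%07d.fasta", d
    d += 1
    file = File.new("#{dir}/#{fname}", "w")
  end
  file << x
  sno += 1
  # push the ">" back so the next record starts with its header marker
  fasta_db.ungetc ">" unless fasta_db.eof?
end

file.close unless file.nil?
fasta_db.close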

It's pretty hackish-looking. But then I found out that BioRuby [2] has wrappers for parsing FASTA files.
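For comparison, here is a less hackish sketch of the same split that avoids the readline/ungetc trick. It does not use BioRuby; it is plain Ruby and reads the whole input into memory, which is an assumption that may not hold for very large databases. The helper names fasta_records and split_fasta and the out_root argument are made up for this illustration; only the D.../SEQ... naming is taken from the script above.

```ruby
require 'fileutils'

# Collect FASTA records from a string. A record is a ">" header line plus
# every following line up to the next header. Lines before the first header
# are ignored (an assumption about how to treat malformed input).
def fasta_records(text)
  records = []
  text.each_line do |line|
    if line.start_with?(">")
      records << line.dup
    elsif !records.empty?
      records.last << line
    end
  end
  records
end

# Write the records n at a time under out_root, using the same
# D0000000/SEQ0000000.fasta naming scheme. Returns the paths written.
def split_fasta(text, n, out_root)
  paths = []
  fasta_records(text).each_slice(n).with_index do |group, d|
    dir = File.join(out_root, format("D%04d000", d / 1000))
    FileUtils.mkdir_p(dir)
    path = File.join(dir, format("SEQ%07d.fasta", d))
    File.write(path, group.join)
    paths << path
  end
  paths
end
```

Something like split_fasta(File.read("db.fasta"), 10, ".") would then write ten sequences per file under the current directory, with db.fasta being a placeholder name.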

[1] http://en.wikipedia.org/wiki/Fasta
[2] http://www.bioruby.org


Why do research in the Philippines?

Because installing grid computing middleware can get you to this:

7th PANDA Grid Workshop, Bohol, Philippines, May 4 - 8, 2009
organised by
Ateneo de Manila University
Sponsored also by EPSRC, IoP, PPARC and the Royal Society of Edinburgh
The aim of the workshop is to bring together grid administrators and software developers in an informal setting, involving open discussions. The focus will include grid maintenance and monitoring and data production with PandaRoot.

Organising committee:

    Rafael P. Saldana (Ateneo)
    Kilian Schwarz (GSI)
    Dan Protopopescu (Glasgow)

Contact person:

Address:

    Holy Name University,
    Lesage and Gallares Streets,
    6300 Tagbilaran City,
    Bohol, Philippines

Let’s look at the itinerary:

Tagbilaran City (May 3, 4, 5)

Metro Centre Hotel and Convention Center
Pres. Carlos P. Garcia Avenue
Tagbilaran City, Bohol
Philippines, 6300
Website: www.metrocentrehotel.com

Panglao Island (May 6, 7, 8, 9, 10)

Bohol Beach Club
Bo. Bolod, Panglao Island, Bohol 6340
Website: www.boholbeachclub.com.ph

Damn, I want to go home!!!


Chicago Startup Factory

The event is a collaboration between the GSB and the CS Department. The group hopes to create technology-heavy startups and businesses, unlike what you get when you gather a bunch of pure business people who can’t come up with a business plan other than canned food, a network of juice/shake stands, etc. The speaker for the Startup Factory talk was Adarsh Arora, CEO of Athena Security and Co-Founder of Lisle Technology Partners. I took note of some of his striking ideas about innovating and generating business plans around technology:

  • never sell more than one innovation - his rationale for this was that the market cannot catch up with all of your ideas. I have not thought about this deeply because [1] I have yet to have a really brilliant idea, and [2] most business models I have seen are so caught up in selling one unique idea that they don’t bother to look at the others (which were probably bad ideas in the first place).
  • interdisciplinary collaboration - now this is more familiar to my school of thought. As we always say in the Ateneo Innovation Center, today’s problems are so complex that you need to apply every type of paradigm to be able to attack the problem from different angles and come up with a brilliant solution.

Adarsh also discussed four types of companies: [1] wishful thinking (you have enough deep connections to get angel funding), [2] historical precedence - selling technology to improve a process, [3] intuitive jump - pure luck; with the democratization of technology, YouTube and eBay became a big thing even though video sharing and online auctions were almost non-existent as web services at the time, and [4] sure technology - you know that there will be a need for it in the future (e.g. the Y2K “bug”).

Follow-up events to this are an Entrepreneurial Brainstorming Session with GSB and CS students and an introduction to creating applications on the iPhone. Apple’s development platform makes it easy for anyone to distribute an app and sell it over iTunes (or AppleStore?), which could let you earn several thousand dollars in a few months.

Oh, and they had free pizza during the talk :)


Great Chicago Book Sale

I quickly grabbed my bike after coming from a seminar class and arrived 10 minutes before closing time! Within that short span of time, and relying on my semi-rare impulse buying, I got these two titles for 5 USD (buy-one-take-one):

W. T. Welford, Useful Optics (Chicago Lectures in Physics). University Of Chicago Press, October 1991.

Students and professionals alike have long felt the need of a modern source of practical advice on the use of optical tools in scientific research. Walter T. Welford’s _Useful Optics_ meets this need. Welford offers a succinct review of principles basic to the construction and use of optics in physics. His lucid explanations and clear illustrations will particularly help those whose interests lie in other areas but who nevertheless must understand enough about optics to create the experimental apparatus necessary to their research. Consistently emphasizing applications and practical points of design, Welford covers a host of topics: mirrors and prisms, optical materials, aberration, the limits of image formation and resolution, illumination for image-forming systems, laser beams, interference and interferometry, detectors and light sources, holography, and more. The final chapter deals with putting together an experimental optics system. Many areas of the physical sciences and engineering increasingly demand an appreciation of optics. Welford’s _Useful Optics_ will prove indispensable to any researcher trying to develop and use effective optical apparatus. Walter T. Welford (1916-1990) was professor of physics at Imperial College of Science, Technology and Medicine from 1951 until his death. He was a Fellow of the Royal Society and of the Optical Society of America.  Link to [Amazon.com]

T. P. Hughes, Human-Built World: How to Think about Technology and Culture (science * culture). University Of Chicago Press, May 2005.

To most people, technology has been reduced to computers, consumer goods, and military weapons; we speak of “technological progress” in terms of RAM and CD-ROMs and the flatness of our television screens. In Human-Built World, thankfully, Thomas Hughes restores to technology the conceptual richness and depth it deserves by chronicling the ideas about technology expressed by influential Western thinkers who not only understood its multifaceted character but who also explored its creative potential.

Hughes draws on an enormous range of literature, art, and architecture to explore what technology has brought to society and culture, and to explain how we might begin to develop an “ecotechnology” that works with, not against, ecological systems. From the “Creator” model of development of the sixteenth century to the “big science” of the 1940s and 1950s to the architecture of Frank Gehry, Hughes nimbly charts the myriad ways that technology has been woven into the social and cultural fabric of different eras and the promises and problems it has offered. Thomas Jefferson, for instance, optimistically hoped that technology could be combined with nature to create an Edenic environment; Lewis Mumford, two centuries later, warned of the increasing mechanization of American life.

Such divergent views, Hughes shows, have existed side by side, demonstrating the fundamental idea that “in its variety, technology is full of contradictions, laden with human folly, saved by occasional benign deeds, and rich with unintended consequences.” In Human-Built World, he offers the highly engaging history of these contradictions, follies, and consequences, a history that resurrects technology, rightfully, as more than gadgetry; it is in fact no less than an embodiment of human values. Link to [Amazon.com]

Even more information can be found on the UChicago Press site.


Midwest Grid School 2008 @UChicago

Yay! It's really close by :) More information about the program and registration can be found on its OSG page. The workshop runs Sept 17-19.

The Open Science Grid (OSG), the TeraGrid and the Computation Institute of The University of Chicago present a three day intensive course in grid computing and its application to scientific discovery.

The course introduces the techniques of grid and distributed computing for science and engineering fields, with hands-on training in the use of national grid computing resources. The workshop introduces essential skills that will be needed by researchers in the natural and applied sciences, engineering, and computer science to conduct and support large-scale computation and data analysis in emerging grid and distributed computing environments.

School participants will work with grid computing experts during the 3-day training. The workshop will focus on enabling the use of the national cyberinfrastructure (the Open Science Grid and TeraGrid) to perform large-scale computations and data-intensive processing in your field of research. Participants will learn to use grids of thousands of processors and will be able to continue to use these resources for their research after the class. We encourage you to bring your research project to us for suggestions and help in porting your application to the grid. We would like to offer you support in transitioning your application to this platform.
