Categories
Uncategorized

Science 2.0: Choosing Infrastructure and Testing Tools for Scientific Software Projects

Titus Brown at the podium at MaRS. Titus Brown delivering his presentation.

Here’s the first of my notes from the Science 2.0 conference, a conference for scientists who want to know how software and the web are changing the way they work. It was held on the afternoon of Wednesday, July 29th at the MaRS Centre in downtown Toronto and attended by 102 people. It was a little different from most of the conferences I attend, where the primary focus is on writing software for its own sake; this one was about writing or using software in the course of doing scientific work.

This entry contains my notes from C. Titus Brown’s presentation, Choosing Infrastructure and Testing Tools for Scientific Software Projects. Here’s the abstract:

The explosion of free and open source development and testing tools offers a wide choice of tools and approaches to scientific programmers.  The increasing diversity of free and fully hosted development sites (providing version control, wiki, issue tracking, etc.) means that most scientific projects no longer need to self-host. I will explore how three different projects (VTK/ITK; Avida; and pygr) have chosen hosting, development, and testing approaches, and discuss the tradeoffs of those choices.  I will particularly focus on issues of reliability and reusability juxtaposed with the mission of the software.

Here’s a quick bio for Titus:

C. Titus Brown studies developmental biology, bioinformatics and software engineering at Michigan State University, and he has worked in the fields of digital evolution and physical meteorology. A cross-cutting theme of much of his work has been software development for computational science, which has led him to software testing and agile software development practices. He is also a member of the Python Software Foundation and the author of several widely-used Python testing toolkits.

  • Should you do open source science?
    • Ideological reason: Reproducibility and open communication are supposed to be at the heart of good science
    • Idealistic reason: It’s harder to change the world when you’re trying to do good science and keep your methods secret
    • Pragmatic reason: Maybe having more eyes on your project will help!
  • When releasing the code for your scientific project to the public, don’t worry about which open source licence to use – the important thing is to release it!
  • If you’re providing a contact address for your code, provide a mailing list address rather than your own
    • It makes it look less “Mickey Mouse” – you don’t seem like one person, but a group
    • It makes it easy to hand off the project
    • Mailing lists are indexed by search engines, making your project more findable
  • Take advantage of free open source project hosting

 

  • Distributed version control
    • “You all use version control, right?” (Lots of hands)
    • For me, distributed version control was awesome and life-changing
    • It decouples the developer from the master repository
    • It’s great when you’re working away from an internet connection, such as if you decide to do some coding on airplanes
    • The distributed nature is a mixed blessing
      • One downside is "code bombs", which are effective forks of the project, created when people don’t check in changes often enough
      • Code bombs lead to complicated merges
      • Personal observation: the more junior the developer, the more they feel that their code isn’t “worthy” and they hoard changes until it’s just right. They end up checking in something that’s very hard to merge
    • Distributed version control frees you from permission decisions – you can simply say to people who check out your code "Do what you want. If I like it, I’ll merge it."

 

  • Open source vs. open development
    • Do you want to simply release the source code, or do you want participation?
      • I think participation is the better of the two
    • Participation comes at a cost, in both support time and attitude
      • There’s always that feeling of loss of control when you make your code open to use and modification by other people
      • Some professors hate it when someone takes their code and does "something wrong" with it
      • You’ll have to answer “annoying questions” about your design decisions
      • Frank ("insulting") discussion of bugs
      • Dealing with code contributions is time-consuming – it takes time to review them
    • Participation is one of the hallmarks of a good open source project

 Slide: "The Stunning Realization"

  • Anecdote
  • I used to work on the “Project Earthshine” climatology project
    • The idea behind the project was to determine how much of the sunlight hitting the Earth was being reflected away
    • You can measure this by observing the crescent moon: the bright part is lit directly by the sun; the dark part is also lit – by sunlight reflected from the Earth
    • You can measure the Greenhouse Effect this way
    • It’s cheaper than measuring sunlight reflected by the Earth directly via satellite
  • I did this work at Big Bear Lake in California, where they hung telescopes to measure this effect at solar observatories
  • I went through the source code of the application they were using, trying to figure out what the grad student who worked on it before me did
  • It turned out that to get “smooth numbers” in the data, his code applied a correction several times
  • His attitude was that there’s no such thing as too many corrections
  • "He probably went on to do climate modelling, and we know how that’s going"
  • How do we know that our code works?
    • We generally have no idea whether our code works; all we do is gain hints
    • And what does "works" mean anyway, in the context of research programming? Does it mean that it gives the results that your PI expects?
  • Two effects of that Project Earthshine experience:
  • Nowadays, if I see agreement between 2 sources of data, I think at least one of them must be wrong, if not both
  • I also came to a stunning realization that:
    • We don’t teach young scientists how to think about software
    • We don’t teach them to be suspicious of their code
    • We don’t teach them good thought patterns, techniques or processes
    • (Actually, CS folks don’t teach this to their students either)
  • Fear is not a sufficient motivator: there are many documented cases where things have gone wrong because of bad code, and they will continue to do so. Famous cases include:
  • If you’re throwing out experimental data because of its lack of agreement with your software model, that’s not a technical problem, that’s a social problem!

 

  • Automated testing
    • The basic idea behind automated testing is to write test code that runs your main code and verifies that the behaviour is expected
    • Example – regression test
      • Run program with a given set of parameters and record the output
      • At some later time, run the same program with the same parameters and record the output
      • Did the output change in the second run, and if so, do you know why?
      • This is a different thing from "is my program correct"
      • If results change unintentionally, you should ask why
    • Example – functional test
      • Read in known data
      • Check that the known data matches your expectations
      • Does your data loading routine work?
      • It works best if you also test with "tricky" data
    • Example – assertions
      • Put "assert parameter >=0" in your code
      • Run it
      • Do I ever pass garbage into this function?
      • You’ll be surprised that things that "should never happen", do happen
      • Follow the classic Cold War motto: “Trust, but verify”
    • There are other kinds of automated testing (acceptance testing, GUI testing), but they don’t usually apply to scientists
    • In most cases, you don’t need to use specialized testing tools
    • One exception is a code coverage tool
      • Answers the question “What lines of code are executed?”
      • Helps you discover dead code branches
      • Guides test writing to untested portions of code
    • Continuous integration
      • Have several "build clients" building your software, running tests and reporting back
      • Does my code build and run on Windows?
      • Does my code run under Python 2.4? Debian 3.0? MySQL 4?
      • Answers the question: “Is there a chance in hell that anyone else can use my code?”
    • Automated testing locks down "boring" code (that is, code you understand)
      • Lets you focus on "interesting" code – tricky code or code you don’t understand
      • Freedom to refactor, tinker, modify, for you and others
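
The three kinds of tests described above (regression test, functional test, assertions) can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the function names (mean_reflectance, load_readings), the one-number-per-line input format and the sample values are all hypothetical, chosen just to make the ideas concrete.

```python
import io
import math


def mean_reflectance(readings):
    """Hypothetical analysis routine: average a list of reflectance readings."""
    # Assertions in the spirit of "assert parameter >= 0": fail loudly on garbage.
    assert len(readings) > 0, "no readings supplied"
    for value in readings:
        assert value >= 0, f"negative reading: {value}"
    return sum(readings) / len(readings)


def load_readings(stream):
    """Hypothetical loader: one number per line; blank lines and '#' comments are skipped."""
    readings = []
    for line in stream:
        text = line.strip()
        if not text or text.startswith("#"):
            continue
        readings.append(float(text))
    return readings


def test_functional_load():
    # Functional test: read in known data, including "tricky" lines
    # (a comment, a blank line, stray whitespace), and check the result.
    known = io.StringIO("# header\n0.25\n\n  0.35  \n")
    assert load_readings(known) == [0.25, 0.35]


def test_regression_mean():
    # Regression test: same parameters as an earlier run, compared with the
    # output recorded back then. A mismatch means "something changed",
    # which is a different question from "is my program correct".
    recorded_output = 0.3
    assert math.isclose(mean_reflectance([0.25, 0.35]), recorded_output)


def test_assertion_catches_garbage():
    # The assertion should reject a value that "should never happen".
    try:
        mean_reflectance([0.3, -1.0])
    except AssertionError:
        return
    raise RuntimeError("negative reading was silently accepted")


if __name__ == "__main__":
    test_functional_load()
    test_regression_mean()
    test_assertion_catches_garbage()
    print("all tests passed")
```

Running the file directly prints "all tests passed", which matches the point that you usually don’t need specialized tools to get started. The test_ prefix follows the naming convention that test runners such as pytest look for. One caveat: Python skips assert statements when run with the -O flag, so assertions like these are a development-time safety net rather than a substitute for input validation.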

C. Titus Brown delivering his presentation at MaRS 

  • If you want to suck people into your open source project:
    • Choose your technology appropriately
    • Write correct software
    • Automated testing can help
  • Closed source science is not science
    • If you can’t see the code, it’s not falsifiable, and if it’s not falsifiable, it’s not science!

Toronto Coffee and Code: Friday, July 31st at the Dark Horse Cafe (215 Spadina)

If it’s Friday, it must be time for another Toronto Coffee and Code! This one will take place at the usual location – the Dark Horse Cafe, 215 Spadina – and will run from 1 p.m. to 6 p.m.

Coffee and Code is my Friday afternoon ritual (a phrase that my classmates at Crazy Go Nuts University will find hauntingly familiar) in which I work out of a cafe and announce that I’ll be there. I’m making myself available as both a Developer Evangelist working for Microsoft Canada and a member of the Toronto Tech Community to answer your questions, hear your comments, bounce ideas around or just chat. Come on down, have a coffee (or tea, or juice) and say hi!


My Statement on IE6

Yes, I know that cats live longer, but I think the quip I made at DemoCamp 21 still makes a good point:

Picture of "Bill the Cat" from "Bloom County" captioned with "If you got a cat when IE6 came out, it's dead now."

Let’s upgrade to compliant, up-to-date browsers, shall we? IE8, or even that hippie browser, if you must.

Credit where credit is due: The “cat’s dead now” line is my remix of a line from a review of the Guns ‘N’ Roses concert that took place here in Toronto a couple of years back. The original line went something like “If you got a cat when Appetite for Destruction came out, it’s dead now.”

The “Race to Market Challenge” for Windows Mobile

This article also appears in Canadian Developer Connection.

The Race to Market Challenge

Here’s a quick little video that explains what the just-announced Race to Market challenge is all about:

If you’ve been thinking about developing for Windows Mobile, now’s the time! We’re now accepting submissions of applications for Windows Marketplace for Mobile, the on-phone store where people with Windows Mobile phones can buy and install mobile applications easily. Better still, we’re making it a contest – submit your Windows Mobile app between now and 11:59 p.m. on December 31st and you’ll automatically be entered in the Race to Market Challenge where you’ll have a chance to win one of 4 Surface tables (developer edition, of course) like the one pictured below with the dashing Developer Evangelist…

surface_pdc

…along with a lot of online marketing and promotion for your application and a really cool trophy.

Winning applications will fall into one of these categories:

  • Most downloaded
  • Most valuable (where “value” is the number of downloads multiplied by the price)
  • Most useful, as judged by a Microsoft panel
  • Most playful, as judged by a Microsoft panel

The Race to Market Challenge runs from now until December 31st, and the sooner you get started, the better your shot at one of the grand prizes. For full details about the contest, visit mobilethisdeveloper.com.

Getting Started with Windows Mobile Development

Between now and the end of the contest, I’ll be posting articles on Windows Mobile development and the Race to Market Challenge. In the meantime, here are some tips that should help you get started.

What You Need

Here’s a snippet from an earlier article of mine that shows you what you need in order to get started with Windows Mobile development. In order to build an application for Windows Mobile 6, you’ll need the following things:

Visual Studio 2008, Professional Edition or higher
This is the development environment. It’s not the only one that you can use to develop Windows Mobile apps, but it’s the one we’re using.

You can also use Visual Studio 2005 – if you do so, Standard Edition or higher will do. If you don’t have Visual Studio, you can download a trial version of Visual Studio 2008.
 

The Windows Mobile 6 SDKs
The Windows Mobile 6 SDKs contain the templates for building Windows Mobile 6 projects and emulators for various Windows mobile phones.

There are two such SDKs to choose from:

  • The Standard SDK. The general rule is that if the device doesn’t have a touch screen, its OS is Windows Mobile 6 Standard, and this is the SDK for developing for it.
  • The Professional SDK. The general rule is that if the device has a touch screen, its OS is Windows Mobile 6 Professional, and this is the SDK for developing for it.

I recommend downloading both SDKs. You never know where you’ll deploy!

.NET Compact Framework 3.5 Redistributable
The .NET Compact Framework 3.5 Redistributable is the version of the .NET framework for mobile devices. It only needs to be sent to the device once.

A Windows Mobile 6 Device
You can get by in the beginning with just the emulators, but you’ll eventually want to try out your app on a real phone. I’m using my phone, a Palm Treo Pro.

As the saying goes, “In theory, there is no difference between theory and practice; in practice, there is.”

The mobile device syncing utility that works with your operating system
If you’ve got a Windows Mobile 6 device, you’ll need the application that connects your mobile phone to your OS:

  • For Windows 7 and Vista, use Windows Mobile Device Center.
  • For Windows XP and Server 2003, use ActiveSync.

Previous Articles on Windows Mobile Development

Here are links to my earlier articles on Windows Mobile development:

I’ll be posting more soon, but these should help you get up and running in the meantime.

If you’ve got any questions or comments about Windows Mobile development or the Race to Market Challenge, feel free to drop me a line or leave a note in the comments!

    A Busy Week

    This article also appears in Canadian Developer Connection.

    It’s gonna be a busy week for me — there’s a lot going on!

    Monday: Damian Conway and The Missing Link

    On Monday evening, I’ll be catching Damian Conway’s presentation, The Missing Link. There’s nothing quite like a Damian Conway presentation – they’re equal parts computer science, mathematical digression, history lesson, physics lecture, pop-culture observation, Perl module code walkthrough and stand-up comedy routine.

    If you’re up for an entertaining and enlightening presentation by one of the bright lights of the open source world and you’re going to be in Toronto tonight, you should catch this one. There’s no charge for admission and no registration process – just show up at University of Toronto’s Bahen Centre for Information Technology (40 St. George Street, west side, just north of College) at 7:00 p.m. and head to room 1160 (the big lecture theatre near the back of the first floor).

    Tuesday: DemoCamp 21 with Special Guest Jon Udell

    Tuesday evening (July 28th) brings the 21st edition of DemoCamp, which I like to describe as “show and tell for the bright lights of the Toronto-area tech community”. It’s a chance for people, from hobbyists working on a pet project to enterprise software developers building something globe-spanning, to show their peers their projects in action or share an idea. It’s put together by my fellow Microsoftie David Crow (who’s also in Microsoft Canada’s Developer and Platform Evangelism group); I co-host the event with Jay Goldman.

    This one’s going to be a special one for a couple of reasons. First, this will be the first DemoCamp held at the Rogers Theatre. Second, Jon Udell, Microsoft Tech Evangelist extraordinaire, will be there.

    The presentations on the schedule are:

    • You can’t pick your neighbours, but you can pick your neighbourhood!
      Saul Colt, Zoocasa
    • ArtAnywhere : Where Lost artwork meets Empty walls
      Christine Renaud, ArtAnywhere
    • Bringing Social Media to Contractors
      Brian Sharwood, HomeStars
    • Create a BlackBerry/iPhone Mobile App in 5 Minutes
      Alan Lysne, Cascada Mobile
    • Stories Told Together – Introducing Social Cards
      Shaun MacDonald, MashupArts
    • WeGoWeGo.com: semantic search for city events
      Dan Wood, WeGoWeGo.com
    • Guestlist – online event management
      Ben Vinegar, Guestlist
    • guiGoog: Advanced Visual Power Search
      Jason Roks, GuiGoog

    Alas, this event is sold out. I’ll take notes and post them on this blog.

    Wednesday: Science 2.0

    The Science 2.0 conference takes place on Wednesday afternoon. Its topic: how the web and computers can radically change and improve science. It takes place at the MaRS Centre and the presentations are:

    • Choosing Infrastructure and Testing Tools for Scientific Software Projects
      Titus Brown
    • A Web Native Research Record: Applying the Best of the Web to the Lab Notebook
      Cameron Neylon
    • Doing Science in the Open: How Online Tools are Changing Scientific Discovery
      Michael Nielsen
    • Using “Desktop” Languages for Big Problems
      David Rich
    • How Computational Science is Changing the Scientific Method
      Victoria Stodden
    • Collaborative Curation of Public Events
      Jon Udell

    As with DemoCamp, this event is a popular one and is sold out. I’ll take notes and blog the conference.

    Thursday: Windows 7 Blogger Event

    I’ll be helping out at a gathering of Toronto bloggers on Thursday, where we’ll be showing them Windows 7.

    Friday: Coffee and Code

    If it’s Friday, it must be time for Toronto Coffee and Code! It’s the day when I set up shop at a cafe – usually the Dark Horse – and work from there, making myself available to answer questions, hear your opinions and comments and chat. I’ll talk about Microsoft, our tools and tech, the industry in general, whatever!

    This Friday’s Toronto Coffee and Code will take place at the Dark Horse Cafe (215 Spadina) from 1 p.m. to 6 p.m. Feel free to drop by!

    Other Stuff Going On This Week

    • Along with the other people on the team, I’m helping out with the preparatory work on the TechDays conference, which will be taking place in seven cities across Canada this fall.
    • I’m also working on an ongoing series of articles covering stuff like coding fundamentals, ASP.NET MVC, mobile and some other stuff that I have to keep on the down-low for the time being.
    • And it’s not too late for me to start working on the ASP.NET MVC presentation that I’m doing with ObjectSharp’s Barry Gervin at the Toronto edition of Stack Overflow’s DevDays conference in October.

    The Sub-$1000 Opportunity?

    U.S. $1000 bill

    Here’s a thought experiment for you Windows developers out there: the fact that Apple pretty much owns the $1000+ computer market is in fact an opportunity. Discuss.

    (This article appears with slightly different wording – to try things from a different perspective – on the official Microsoft Canada developer blog, Canadian Developer Connection.)

    The “CSS Is Awesome” Mug

    If you’ve worked with CSS and multiple browsers long enough, this mug will make you laugh (or cry):

    Mug with a square containing the text "CSS IS AWESOME", with the "AWESOME" extending beyond the boundary of the square.

    At US$12.95 (available at Zazzle.com), it’s a pretty good “Secret Santa” gift for the web developer or designer on your team.