Categories
Uncategorized

Do You Code in Visual Basic?

"Visual Basic" graphic from the old packaging

Over at Canadian Developer Connection, the blog I’m actually paid to write for, I’m asking readers – presumably Canadian software developers who write code for Microsoft’s platforms – whether they write code in Visual Basic. I’ve posted code in C# and (Iron)Ruby, and plan to post code in (Iron)Python, F# and JavaScript, but none in Visual Basic.

Since you make an ass out of umption when you make an assumption, I thought I’d ask whether the Canadian Developer Connection readership codes in VB, and thus far, the answers run the gamut from “no” to “only if I have to”.

Categories
Uncategorized

Exceptions: The Airbags of Code

This article also appears in Canadian Developer Connection.

its_okay_i_wrote_an_exception

The trouble with a lot of example code covering exceptions is that the examples are often cases in which you shouldn’t be using an exception in the first place. Consider the classic known as “Someone’s trying to divide by zero” – here’s the C# version:

// C#

try
{
    result = dividend / divisor;
}
catch (DivideByZeroException ex)
{
    Console.WriteLine("Idiot.");
}

and here’s the Ruby version:

# Ruby (works in IronRuby too!)

begin
    result = dividend / divisor
rescue ZeroDivisionError
    puts "Idiot."
end

# You have to hand it to Ruby for picking great keywords for
# exception handling. While C# borrowed Java's "try / catch / finally",
# Ruby went with the more macho "begin / rescue / ensure".
# As Yoda himself would say: "Do or do not. There is no try."

The better approach would be to do a little defensive programming and make sure that divisor is non-zero before performing the division operation. So why do tutorials on exception handling almost always bring out the “Someone’s trying to divide by zero” example?
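In Ruby, that defensive check might look like the following minimal sketch (the `safe_divide` method name is my own invention, not anything from the article):

```ruby
# A guard clause checks the precondition up front instead of letting
# the division raise and rescuing after the fact.
def safe_divide(dividend, divisor)
  return nil if divisor.zero?   # precondition check, no exception machinery
  dividend / divisor
end
```

Returning `nil` here is just one choice of sentinel; the point is that the caller decides what a zero divisor means, rather than paying for a thrown-and-caught exception.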

There are two reasons:

  • It’s simple. It’s only a handful of lines of code.
  • It’s predictable. Set the value of divisor to zero and the exception gets thrown. Always.

The truly exceptional exceptions – I/O errors, timeouts and other failures outside your control – are a little harder to set up and take more code to handle. Hence the divide-by-zero example; it illustrates try and catch (or begin and rescue in Ruby) in a way even the newest newbie can understand.

The problem is that many tutorial authors don’t get any deeper than simply explaining the keywords with simple examples, leading people to misuse exceptions, either as a substitute for checking for preconditions or as an unstructured form of flow control in the style of the much-maligned goto (which in many cases is considered harmful).

Like goto, exceptions are unstructured jumps, which make your program’s flow more complex. Unlike goto, exceptions are computationally “expensive” because of all the extra work involved in setting up handlers and unwinding the stack when an exception is thrown.

A good guideline to follow is that exceptions are for exceptional cases. Stuff that you can’t easily predict. You can tell if a division operation is going to result in an undefined result – just look at the divisor! Harder to predict are things like whether a server access will time out or if the hard drive will decide that the moment you’re reading a file is the best possible time to corrupt it. Those hard-to-foresee, believed-to-be-rare, exceptional cases are really what exceptions are meant to handle.
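For contrast, here’s a hedged Ruby sketch of one of those genuinely exceptional cases: a file read that can fail for reasons no amount of up-front checking can rule out. The `read_config` method and its fallback behavior are invented for illustration:

```ruby
# The file may disappear or become unreadable between any check you
# could make and the read itself, so a rescue is appropriate here.
def read_config(path)
  File.read(path)
rescue Errno::ENOENT, IOError => e
  warn "Could not read #{path}: #{e.message}"
  nil   # fall back to a sentinel the caller can handle
end
```

Unlike the divisor, you can’t “just look at” the hard drive before reading; this is the kind of case exceptions were built for.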

Think of exceptions as being like the airbags in your car. The idea is that they’re a last resort; they’re no substitute for defensive driving. (Like airbags, they’re also expensive to reset.)

Lee Dumond goes into further detail on the topic of defensive programming as being like defensive driving in an article titled Defensive Programming, or Why Exception Handling Is Like Car Insurance. He cites the “Someone’s trying to divide by zero” example, provides a list of defensive programming strategies that you should consider before coding up that exception handler and talks about those exceptional cases when you will have to use an exception. Check it out!

Categories
Uncategorized

Know Your Cat Ports

cat_ports

For more comics like this, see www.slowwave.com.

Categories
Uncategorized

Windows Mobile Case Study: Porting Amplitude to WinMo

HTC phone with Amplitude on screen (simulated)

The Windows Mobile Blog points to an MSDN article covering how Amplitude, an application for the iPhone, was ported to Windows Mobile.

Here’s a quick description of Amplitude, which is developed by Gripwire, a mobile and social app company based in Seattle, courtesy of the Windows Mobile Blog:

Amplitude picks up any sound in a user’s surroundings through the microphone and then amplifies the sound, rendering it into a rich graphical representation on the device. Amplitude can be used to amplify any sounds, such as human or animal heartbeats, that usually wouldn’t be picked up by the human ear. Amplitude provides a cool user interface featuring an oscilloscope that allows users to view and visually quantify signal voltages, so you can see the volume of the sound that you are listening to.

The MSDN article on the Amplitude porting project covers a lot of ground.

Whether you’re thinking of expanding your iPhone application to other platforms or starting a new Windows Mobile app project, you’ll find this case study packed with useful information and links. I’m going to expand on some of the topics covered in the article in future posts on this blog.

And don’t forget – there’s the Race to Market Challenge, in which you’re automatically entered whenever you submit a mobile app to Windows Marketplace for Mobile. Here’s a quick reminder of what Race to Market is all about:

Categories
Uncategorized

Science 2.0: How Computational Science is Changing the Scientific Method

This article also appears in Canadian Developer Connection.

Victoria Stodden speaking at the Science 2.0 conference    

Here’s the third in a series of notes from the Science 2.0 conference, a conference for scientists who want to know how software and the web are changing the way they work. It was held on the afternoon of Wednesday, July 29th at the MaRS Centre in downtown Toronto and attended by 102 people. It was a little different from most of the conferences I attend, where the primary focus is on writing software for its own sake; this one was about writing or using software in the course of doing scientific work.

My previous notes from the conference:

This entry contains my notes from Victoria Stodden’s presentation, How Computational Science is Changing the Scientific Method.

Here’s the abstract:

As computation becomes more pervasive in scientific research, it seems to have become a mode of discovery in itself, a “third branch” of the scientific method. Greater computation also facilitates transparency in research through the unprecedented ease of communication of the associated code and data, but typically code and data are not made available and we are missing a crucial opportunity to control for error, the central motivation of the scientific method, through reproducibility. In this talk I explore these two changes to the scientific method and present possible ways to bring reproducibility into today’s scientific endeavor. I propose a licensing structure for all components of the research, called the “Reproducible Research Standard”, to align intellectual property law with longstanding communitarian scientific norms and encourage greater error control and verifiability in computational science.

Here’s her bio:

Victoria Stodden is the Law and Innovation Fellow at the Internet and Society Project at Yale Law School, and a Fellow at Science Commons. She was previously a Fellow at Harvard’s Berkman Center and postdoctoral fellow with the Innovation and Entrepreneurship Group at the MIT Sloan School of Management. She obtained a PhD in Statistics from Stanford University, and an MLS from Stanford Law School.

The Notes

  • My research has been on how massive computation has changed the practice of science and the scientific method
    • Do we have new modes of knowledge discovery?
    • Are standards of what we considered knowledge changing?
    • Why aren’t researchers sharing?
    • One of my concerns is facilitating reproducibility
      • The Reproducible Research Standard
      • Tools for attribution and research transmission
  • Example: Community Climate Model
    • Collaborative system simulation
    • There are community models available
    • Built on open code, data
    • If you want to model something as complex as climate, you need data from different fields
    • Hence, it’s open
  • Example: High energy physics
    • Enormous data produced at LHC at CERN — 15 petabytes annually
    • Data shared through grid
    • CERN director: 10 – 20 years ago, we might have been able to repeat an experiment – they were cheaper, simpler and on a smaller scale. Today, that’s not the case
  • Example: Astrophysics
    • Data and code sharing, even among amateurs uploading their photos
    • Simulations: This isn’t new: even in the mid-1930s, they were trying to calculate the motion of cosmic rays in Earth’s magnetic field via simulation
  • Example: Proofs
    • Mathematical proof via simulation vs deduction
    • My thesis was proof via simulation – the results were not controversial, but the methodology was

Victoria Stodden and her "Really Reproducible Research" slide

  • The rise of a “Third Branch” of the Scientific Method
    • Branch 1: Deductive/Theory: math, logic
    • Branch 2: Inductive/Empirical: the machinery of hypothesis testing – statistical analysis of controlled experiments
    • Branch 3: Large-scale extrapolation and prediction – are we gaining knowledge from computation/simulations, or are they just tools for inductive reasoning?
    • Contention — is it a 3rd branch?
      • See Chris Anderson’s article, The End of Theory (Wired, June 2008)
      • Systems that explain the world without a theoretical underpinning?
      • There’s the “Hillis rebuttal”: Even with simulations, we’re looking for patterns first, then create hypotheses, the way we always have
      • Steve Weinstein’s idea: Simulation underlies both branches:
        • It’s a tool to build intuition
        • It’s also a tool to test hypotheses
      • Simulations let us manipulate systems we can’t fit in a lab
    • Controlling error is central to scientific process

Victoria Stodden at Science 2.0 and her "Top reasons not to share" slide

  • Computation is increasingly pervasive in science
    • In the Journal of the American Statistical Association (JASA):
      • In 1996: 9 out of 20 articles published were computational
      • In 2006: 33 out of 35 articles published were computational
  • There’s an emerging credibility crisis in computational science
    • Error control forgotten? Typical scientific computation papers don’t include code and data
    • Published computational science is near impossible to replicate
    • JASA June 1996: None of the computational papers provided any code
    • JASA June 2006: Only 3 out of the 33 computational articles made their code publicly available
  • Changes in scientific computation:
    • Internet: Communication of all computational research details and data is possible
    • Scientists often post papers but not their complete body of research
    • Changes coming: Madagascar, Sweave, individual efforts, journal requirements
  • A potential solution: Really reproducible research
    • The idea that an article is not the scholarship itself, but merely the advertisement of that scholarship
  • Reproducibility: can a member of the field independently verify the result?

Victoria Stodden at Science 2.0, with her "Controlling error" slide

  • Barriers to sharing
    • Took a survey of computational scientists
    • My hypotheses, based on the literature of scientific sociology:
      • Scientists are primarily motivated by personal gain or loss
      • Scientists are primarily worried about being “scooped”
  • Survey:
    • The people I surveyed were from the same subfield: Machine learning
    • They were American academics registered at a top machine learning conference (NIPS)
    • Respondents: 134 responses from 638 requests (23%, impressive)
    • They were all from the same legal environment of American intellectual property
  • Based on comments, the fear of being scooped is in the back of people’s minds
    • Reported sharing habits
      • 32% made their code available on the web
      • 48% made their data available on the web
      • 81% claimed to reveal their code
      • 84% claimed to reveal their data
      • Visual inspection of their sites revealed:
        • 30% had some code posted
        • 20% had some data posted
  • Preliminary findings:
    • Surprising: They were motivated to share by communitarian ideals
    • Surprising: They were concerned about copyright issues
  • Barriers to sharing: legal
    • The original expression of ideas falls under copyright by default
    • Copyright creates exclusive right of author to:
      • Reproduce work
      • Prepare derivative works
  • Creative Commons
    • Make it easier for artists to share and use creative works
    • A suite of licences that allows the author to determine the terms
    • Licences:
      • BY (attribution)
      • NC (non-commercial)
      • ND (no derived work)
      • SA (share-alike)
  • Open Source Software Licencing
  • Creative Commons follows the licencing approach used for open source software, but adapted for creative works
  • Code licences:
    • BSD licence: attribution
    • GPL: attribution and share-alike
  • Can this be applied to scientific work?
  • The goal is to remove copyright’s block to fully reproducible research
  • Attach a licence with an attribution to all elements of the research compendium

Victoria Stodden at the Science 2.0 conference and her "Real and Potential Wrinkles" slide

  • Proposal: Reproducible research standard
    • Release media components (text, data) under CC BY
    • Code: Modified BSD or MIT (attrib only)
  • Releasing data
    • Raw facts alone are generally not copyrightable
    • Selection or arrangement of data results in a protected compilation only if the end result is an original intellectual creation (US and Canada)
    • Subsequently qualified: facts not copied from another source can be subject to copyright protection
  • Benefits of RRS
    • Changes the discussion from “here’s my paper and results” to “here’s my compendium”
    • Gives funders, journals and universities a “hook”
    • If your funding is public, so should your work!
    • Standardization avoids licence incompatibilities
    • Clarity of rights beyond fair use
    • IP framework that supports scientific norms
    • Facilitation of research, thus citation and discovery
  • Reproducibility is Subtle
    • Simple case: Open data and small scripts. Suits simple definition
    • Hard case: Inscrutable code; organic programming
    • Harder case: Massive computing platforms, streaming sensor data
    • Can we have reproducibility in the hard cases?
    • Where are acceptable limits on non-reproducibility?
      • Privacy
      • Experimental design
    • Solutions for harder cases
      • Tools
  • Openness and Taleb’s criticism
    • Scientists are worried about contamination by amateurs
    • Also concerned about the “Prisoner’s dilemma”: they’re happy to share their work, but not until everyone else does
Categories
Uncategorized

Science 2.0: A Web Native Research Record – Applying the Best of the Web to the Lab Notebook

This article also appears in Canadian Developer Connection.

Cameron Neylon and his "Creative Commons" slide at Science 2.0

Intro

Here’s the second of my notes from the Science 2.0 conference, a conference for scientists who want to know how software and the web are changing the way they work. It was held on the afternoon of Wednesday, July 29th at the MaRS Centre in downtown Toronto and attended by 102 people. It was a little different from most of the conferences I attend, where the primary focus is on writing software for its own sake; this one was about writing or using software in the course of doing scientific work.

My previous notes from the conference:

This entry contains my notes from Cameron Neylon’s presentation, A Web Native Research Record – Applying the Best of the Web to the Lab Notebook.

Here’s the abstract:

Best practice in software development can save researchers time and energy in the critical analysis of data but the same principles can also be applied more generally to recording research process. Successful design patterns on the web tend to be those that successfully couple people into efficient information transfer mechanisms. Can we re-think the way we create, keep, and share our research records by using these design patterns to make it more effective?

Here’s Cameron’s bio:

Cameron Neylon is a biophysicist who has always worked in interdisciplinary areas and is a leading advocate of data availability. He currently works as Senior Scientist in Biomolecular Sciences at the ISIS Neutron Scattering facility at the Science and Technology Facilities Council. He writes and speaks regularly on the interface of web technology with science and is well-known as one of the leading proponents of open science.

The Notes

  • Feel free to copy and remix this presentation – it’s licenced under Creative Commons

 

  • What is the web good for?
    • Publishing
    • Subscribing
    • Syndicating
    • Remixing, mashing up and generally doing stuff with content
    • Collaborating
  • What do scientists do?
    • Publish
    • Syndicate (CRC books are a form of syndication)
    • Remix (take stuff from different disciplines — pull things together, remix them)
    • Validate
    • Collaborate
  • So, with this overlap, the web has solved science problems, right?
    • No — papers are dead, broken and disconnected
      • Papers don’t have links
      • The whole scientific record is fundamentally a dead document
    • The links between things make the web go round
    • I want to make science less like a great big monolithic document and make it more like a network of pieces of knowledge, wired together:
      • Fragments of science
      • Loosely coupled
      • Tightly wired

Cameron Neylon and his "Fragments of science / Loosely coupled / Tightly wired" slide at Science 2.0

  • What is a “fragment of science”?
    • A paper is too big a piece, even if it is the "minimal publishable unit"
    • A tweet is too small
    • A blog post would be the right size
  • His lab book is a collection of various electronic documents:
    • Excel files
    • Some basic version control
    • Data linked back to description of process used to create the data
    • As far as possible, the blogging is done automatically by machines
    • It doesn’t have to be complicated
  • [Shows a scatter plot, with each point representing an experiment]:
    • Can we tell an experiment didn’t work by its position on the graph?
    • We can tell which experiments weren’t recorded properly – they have no links to other experiments
  • The use of tagging and “folksonomies” goes some way, but how do you enforce it?
    • Tags are inconsistent — not just between people, but even within a single person – you might tag the same thing differently from day to day
    • Templates create a virtuous circle, a self-assembling ontology
    • We found that in tagging, people were mixing up process and characteristics – this tells us something about the ontology process

Cameron Neylon and his "Physical objects / Digital objects" slide at Science 2.0

  • Put your data in external services where appropriate
    • Flickr for images
    • YouTube for video
    • The RCSB Protein Data Bank (PDB) for structures
    • Chemspider
    • Even Second Life can be used as a graphing medium!
    • All these services know how to deal with specific data types
  • Samples can be offloaded
    • LIMS, database, blogs, wiki, spreadsheet
    • Procedures are just documents
    • Reuse existing services
    • Semantic feed of relationships — harness Google: most used is the top result
  • Semantic web creates UI issues
    • Just trying to add meaning to results is one step beyond what scientists are expected to do
    • We need a collaborative document environment
    • The document environment must feel natural for people to work in
    • When they type something relevant, the system should realize that and automatically link it
    • We’re at the point where doc authoring systems can use regular expressions to recognize relevant words and autolink them
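As a toy illustration of that last bullet, a few lines of Ruby can recognize glossary terms with a regular expression and wrap them in links. The terms and URLs below are invented for the example, not taken from Neylon’s talk:

```ruby
# A made-up glossary mapping terms to URLs.
GLOSSARY = {
  "lysozyme" => "https://example.org/protein/lysozyme",
  "SDS-PAGE" => "https://example.org/method/sds-page"
}

# Build one alternation pattern from the glossary keys (Regexp.union
# escapes them), then replace each occurrence with an HTML link.
def autolink(text, glossary = GLOSSARY)
  pattern = Regexp.union(glossary.keys)
  text.gsub(pattern) { |term| %(<a href="#{glossary[term]}">#{term}</a>) }
end
```

A real system would need word-boundary handling and a much bigger vocabulary, but this is the basic mechanism.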

Cameron Neylon and his "Open" slide at Science 2.0

  • The current mainstream response to these ideas is:
    • The gamut from "You mean Facebook?" to horror
    • I’m not worried about these ideas not getting adopted
  • Scientists are driven by impact and recognition
    • How do we measure impact?
      • Right now, we do this by counting the number of papers for which you’re an author
      • Most of my output is not published in traditional literature; it’s published freely on the web for other people to use
      • If they’re not on the web, they disappear from the net
      • The future measure of your scientific impact will be its effect on the global body of knowledge
      • Competition will drive adoption
Categories
Uncategorized

Barbara Liskov, Interviewed

This article also appears in Canadian Developer Connection.

Barbara Liskov

The Interview

Over at the IT Manager Connection blog, there’s an interview with Barbara Liskov, who is:

  • The Ford Professor of Engineering at MIT’s Electrical Engineering and Computer Science Department
  • An Institute Professor at MIT
  • The first woman in the United States to earn a Ph.D. in computer science
  • The recipient of the 2008 ACM Turing Award (presented in 2009)
  • An IEEE John von Neumann Medal recipient for 2004
  • An ACM and American Academy of Arts and Sciences Researcher
  • …and most relevant to us, the “Liskov” in the Liskov Substitution Principle, one of the five SOLID principles for object-oriented design.

In the interview, Barbara talks about winning “the Nobel Prize of computing”, her vision for computing, what got her interested in computers, the challenges that the field still presents to minorities, the work she’s done and her thoughts on up-and-coming tech. If you’d like to listen, here’s the MP3 of Stephen Ibaraki interviewing Barbara Liskov. Stephen also wrote an article containing an abbreviated transcript that appears in IT Manager Connection. Enjoy!

The Liskov Substitution Principle

Small Liskov Substitution Principle poster

In case you’ve forgotten (or perhaps never learned), the Liskov Substitution Principle is:

If for each object o1 of type S there is an object o2 of type T such that for all
programs P defined in terms of T, the behavior of P is unchanged when o1 is
substituted for o2 then S is a subtype of T.

Well, duh. Who didn’t know that?

Object guru Robert C. “Uncle Bob” Martin took this bit of math nerd-speak and paraphrased it in a way that makes it somewhat easier to follow:

Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.

And because I’m nowhere near as smart as Uncle Bob, here’s the way I like to cover it:

If MySubclass is a subclass of MyClass, you should be able to replace instances of MyClass with MySubclass without breaking anything. Sort of like when they changed the actors who played "Darren" in Bewitched or "Becky" in Roseanne.

(Unlike Liskov or Martin, I don’t have to write academic papers, so I can get away with making references to old TV shows.)
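That substitution can be sketched in a few lines of Ruby, using the MyClass and MySubclass names from above (the greet behavior is invented for the example):

```ruby
class MyClass
  def greet
    "hello"
  end
end

# A well-behaved subclass: it changes the details but honors the
# contract callers of MyClass rely on (greet returns a string).
class MySubclass < MyClass
  def greet
    "hello there"
  end
end

# This method is written in terms of MyClass...
def shout(obj)
  obj.greet.upcase
end
```

...and it works unchanged whether you hand it a MyClass or a MySubclass – the recast-Darren scenario in code.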

As I mentioned earlier, I’ll be writing more about the SOLID principles. Watch this space!