Monday, December 12, 2011

The Cruise Ships are to Blame, Right! Right?

I swim in the San Francisco Bay, naked (without a wetsuit). And I belong to a swim club, next door to another such club, filled with like swimmers and rowers. We are very conscientious about water quality in the bay as its our pool and a special place.

In September, a thread sprung up on my swim club’s listserv about our swim hole being ‘posted’, which bluntly put it means there’s poop in the water -- unsafe levels of contamination. This is clearly a less than ideal situation for swimming. While the thread started by examining and trying to interpret the water quality readings, posters quickly turned on docking cruise ships and named them the culprits.

The first few posts were data focused. It began with an email forwarded from the San Francisco Public Utilities Commission (SFPUC) detailing the unhealthy levels of EColi and Enterococcus and led to a series of questions trying to reconcile the email with the data the SFPUC posted to San Francisco’s public data site datasf.org (http://bit.ly/sjfz6O). Initially, respondents were unsure of how to interpret the SFPUC’s water readings: How fresh is the data? Why are the warnings they sent via email different from the data on their website? What is the lag time between when readings are taken and when they are posted? Does this affect our understanding of the water quality?

Even before the questions were answered, it was blame time. It’s the Cruise Ships: “Is it a coincidence that the huge cruise ship just docked at Pier 29? I think not.... (ewwww.....)”.

Me, I’m a data person. I was still interested in understanding how the data was collected and how valid it was. Luckily, by replying to the list I was able to find the Cruise Ship Schedule and within an hour render a viz that shut down the rumor mill.



Based on the data, there was no clear correlation between cruise ship activity and unsafe water quality readings. There were an equal number of weeks in which two ships departed that had failing readings as there were that did not. No single cruise line, last port of departure or departing berth showed any significant correlation either.






What was clear from the data was that only Aquatic Park and Jackrabbit beach had EColi closures, which points to failing drainage or other spillage that drained into the bay.

It would have been great to identify the offending cruise ship company. I settled for returning the conversation to understanding the data and focusing on other sources for investigation: sewer pipes, runoff and local businesses. And then I sent this on to Deb at San Francisco Baykeeper, a non-profit "pollution watchdog" for the San Francisco Bay.  I learned that the contamination was likely linked to the Warf's dilapidated sewage system, which Baykeeper is working with the city to remedy.  Thank you SF Baykeeper!  Keep up the great work.

Calculating Age in MySQL

I was so thrilled with PostgreSQL's age function that I looked up the same for MySQL:

FLOOR((TO_DAYS(NOW())- TO_DAYS(date_of_birth)) / 365.25)

Wednesday, July 6, 2011

Greenplum / PostgreSQL Function of the Day: age

How old is my customer?  How old is that order?  How old is our product widget?

When doing customer profiling and segmentation I often want to find a customer's age.  This is also useful for calculating the age of anything.  Luckily, PostgreSQL makes this easy and it's available in Greenplum 4.0.4.0: they have an age function.

The age(date) function will operate on dates through timestamps and return age in years all the way through to years, days, hours, minutes, and seconds.

For example, my sister was born on 7/4/1980.  If her date of birth was in my family table under date_of_birth I could calculate her age :

select age(date_of_birth) from family;

And the db would return 31.

If I wanted to ask, how long have I owned this asset, I could query the db:

select age(created_at) from assets where id = 11; 

And it would calculate the age from this minute as:

age_of_asset                        created_at
3 years 10 mons 14 days 14:59:15    2007-08-22 09:00:45