I swim in the San Francisco Bay, naked (without a wetsuit). And I belong to a swim club, next door to another such club, filled with like swimmers and rowers. We are very conscientious about water quality in the bay as its our pool and a special place.
In September, a thread sprung up on my swim club’s listserv about our swim hole being ‘posted’, which bluntly put it means there’s poop in the water -- unsafe levels of contamination. This is clearly a less than ideal situation for swimming. While the thread started by examining and trying to interpret the water quality readings, posters quickly turned on docking cruise ships and named them the culprits.
The first few posts were data focused. It began with an email forwarded from the San Francisco Public Utilities Commission (SFPUC) detailing the unhealthy levels of EColi and Enterococcus and led to a series of questions trying to reconcile the email with the data the SFPUC posted to San Francisco’s public data site datasf.org (http://bit.ly/sjfz6O). Initially, respondents were unsure of how to interpret the SFPUC’s water readings: How fresh is the data? Why are the warnings they sent via email different from the data on their website? What is the lag time between when readings are taken and when they are posted? Does this affect our understanding of the water quality?
Even before the questions were answered, it was blame time. It’s the Cruise Ships: “Is it a coincidence that the huge cruise ship just docked at Pier 29? I think not.... (ewwww.....)”.
Me, I’m a data person. I was still interested in understanding how the data was collected and how valid it was. Luckily, by replying to the list I was able to find the Cruise Ship Schedule and within an hour render a viz that shut down the rumor mill.
Based on the data, there was no clear correlation between cruise ship activity and unsafe water quality readings. There were an equal number of weeks in which two ships departed that had failing readings as there were that did not. No single cruise line, last port of departure or departing berth showed any significant correlation either.
What was clear from the data was that only Aquatic Park and Jackrabbit beach had EColi closures, which points to failing drainage or other spillage that drained into the bay.
It would have been great to identify the offending cruise ship company. I settled for returning the conversation to understanding the data and focusing on other sources for investigation: sewer pipes, runoff and local businesses. And then I sent this on to Deb at San Francisco Baykeeper, a non-profit "pollution watchdog" for the San Francisco Bay. I learned that the contamination was likely linked to the Warf's dilapidated sewage system, which Baykeeper is working with the city to remedy. Thank you SF Baykeeper! Keep up the great work.
Online Workbook for Analytics & Data Goodness
Monday, December 12, 2011
Calculating Age in MySQL
I was so thrilled with PostgreSQL's age function that I looked up the same for MySQL:
FLOOR((TO_DAYS(NOW())- TO_DAYS(date_of_birth)) / 365.25)
FLOOR((TO_DAYS(NOW())- TO_DAYS(date_of_birth)) / 365.25)
Wednesday, July 6, 2011
Greenplum / PostgreSQL Function of the Day: age
How old is my customer? How old is that order? How old is our product widget?
When doing customer profiling and segmentation I often want to find a customer's age. This is also useful for calculating the age of anything. Luckily, PostgreSQL makes this easy and it's available in Greenplum 4.0.4.0: they have an age function.
The age(date) function will operate on dates through timestamps and return age in years all the way through to years, days, hours, minutes, and seconds.
For example, my sister was born on 7/4/1980. If her date of birth was in my family table under date_of_birth I could calculate her age :
And the db would return 31.
If I wanted to ask, how long have I owned this asset, I could query the db:
And it would calculate the age from this minute as:
When doing customer profiling and segmentation I often want to find a customer's age. This is also useful for calculating the age of anything. Luckily, PostgreSQL makes this easy and it's available in Greenplum 4.0.4.0: they have an age function.
The age(date) function will operate on dates through timestamps and return age in years all the way through to years, days, hours, minutes, and seconds.
For example, my sister was born on 7/4/1980. If her date of birth was in my family table under date_of_birth I could calculate her age :
select age(date_of_birth) from family;
And the db would return 31.
If I wanted to ask, how long have I owned this asset, I could query the db:
select age(created_at) from assets where id = 11;
And it would calculate the age from this minute as:
age_of_asset created_at 3 years 10 mons 14 days 14:59:15 2007-08-22 09:00:45
Thursday, September 30, 2010
If you're going to invest in the future you have to ask yourself, what's in the future? Lithium and coffee are good bets.
In my quest to better understand who Kiva loans to I looked into who gets the most money. Surprisingly I found it was Bolivia and Uganda. My first thoughts were, Why? I know almost nothing about Bolivia and even less about Uganda. I know which continent each are on, which is probably a leap over most Americans. Cynicism aside, why are they getting lots more money than other countries? What's going on in these countries that make them so appealing? Why am I asking this? Take a look (I apologize for the poorly rendered graphic but I cannot yet post a Tableau 6 public viz since it's still in beta and hence a bit skewed):
With my limited information, I could surmise that Kiva gives loans to countries that have lithium (Bolivia), for the lithium ion batteries in your laptops and hybrid cars, and coffee (Uganda) because high-end, delicious coffee is market valuable. But there must be more, right?
For Bolivia there is. They have a 0.00% delinquency and 0.00% default rate. Stellar! And their economy is growing respectably. In many ways, they've benefitted from the post-Cold War period: even after the West pulled their support in the late 1980s, Bolivia had the second largest reserve of natural gas in South America and now with the growing importance of lithium they have quite a leg to lean on.
But what about Uganda? Uganda is in the Top 5 for delinquency. So why does Kiva loan to them? It could be their vast coffee exports, 56% of GDP. Is that enough? Clearly this warrants more study.
And with all this, what's up with the Dominican Republic? They're in the Top 10 of direct loans from Kiva, but with a considerable default rate. What are they doing that affords them additional funding even with a high default rate?
More importantly, in emerging economies how can we replicate what Bolivia and the others in the Top Tier countries are doing especially with their miniscule default rate?
Stay tuned.
With my limited information, I could surmise that Kiva gives loans to countries that have lithium (Bolivia), for the lithium ion batteries in your laptops and hybrid cars, and coffee (Uganda) because high-end, delicious coffee is market valuable. But there must be more, right?
For Bolivia there is. They have a 0.00% delinquency and 0.00% default rate. Stellar! And their economy is growing respectably. In many ways, they've benefitted from the post-Cold War period: even after the West pulled their support in the late 1980s, Bolivia had the second largest reserve of natural gas in South America and now with the growing importance of lithium they have quite a leg to lean on.
But what about Uganda? Uganda is in the Top 5 for delinquency. So why does Kiva loan to them? It could be their vast coffee exports, 56% of GDP. Is that enough? Clearly this warrants more study.
And with all this, what's up with the Dominican Republic? They're in the Top 10 of direct loans from Kiva, but with a considerable default rate. What are they doing that affords them additional funding even with a high default rate?
More importantly, in emerging economies how can we replicate what Bolivia and the others in the Top Tier countries are doing especially with their miniscule default rate?
Stay tuned.
Monday, September 13, 2010
Where exactly are your Kiva micro loans going?
I love Kiva.org. I love the idea of Kiva (micro lending) and the practice is so easy it makes me smile (and make regular loans). In lieu of a Peet's cappuccino each day I can help a Peruvian woman buy a motorcycle to transport her family to the doctor, or to buy supplies to make clothes or to sell cosmetics, or to stock her grocery store. I'm not sure why Peruvian women are such a lure for me, but they are. And Nicaraguan women. It's women. I find myself wanting to help women in developing nations improve their lives. Women in Central and South America. Women in Africa. Women the wold over.
These are the entrepreneurs that I sponsor: the Peruvian and the Nicaraguan women. These are the women that I envision when making my loans.
These are the entrepreneurs that I sponsor: the Peruvian and the Nicaraguan women. These are the women that I envision when making my loans.
I made my first Kiva loan in February 2009. I think I heard about Kiva on NPR and was instantly hooked. I loved the idea of direct assistance and assistance to women in particular. Since then, I've made 10 loans, 3 to Nigerian women, 2 to Peruvian women, and 1 each to women in Nicaragua, Cambodia, Mexico, the Dominican Republic, and the Philippines. For some reason, I thought they were mostly Peruvian women but alas the data tells a different story.
Tonight I was wondering where Kiva sends its money and whether some countries are better than others at repaying their loans. And, with the miraculous wizardry of Tableau I was able to quickly see that there is a correlation between place and delinquency, place and loans disbursed. Mind you, I'm talking correlation and not causality. I don't have enough data to say why Kenya is the greatest defaulter of loans with the US trailing as a close second, or why Tanzania, Nicaragua and Ghana have delinquency rates exceeding 50%, which are considerably greater than any other country with active loans, but I can quickly see these details and that tells me there's something to investigate.
So what did I learn? Kiva makes lots of loans to Cambodia and Peru. And those borrowers, they're good for it. In fact, they're better than those in the US. Uganda gets less money and is a bigger risk. The countries who were the biggest risks got the least money. So, Kiva is spending my money well. I think. But what can we do to help those with high delinquency or default rates? It must be more than your country of origin that shapes your credit worthiness. Or so I would hope. What do you think?
Check out my data set and visualizations at http://public.tableausoftware.com/workbooks/Kiva_Org-MicrolendingRevealed
Tuesday, August 3, 2010
Stop Trying to Delight Your Customers and Simply Make it Easy for Them
This month's Harvard Business Review (HBR) article Stop Trying to Delight Your Customers argues that companies should focus on meeting their customers' needs rather than trying to provide over-the-top service to increase tenure and customer loyalty. The latter doesn't translate into increased loyalty and efforts toward it are usually at the expense of the former.
Customers notoriously punish bad service, particularly when left on hold, transferred while on a service call, or when they receive egregious customer service. We've all been there and we've all told two friends and two friends and so on. Customers share negative experiences at twice the rate of positive ones, according to a customer service and loyalty study from the Customer Contact Council. What's more, reducing customers' efforts builds loyalty, the vague notion of delighting them does not. Customers want their problem solved, not some whiz-bang, over the top widget. And finally, improved customer service, particularly in the call-to-contact realm does more to reduce churn than the "delight them" efforts. Maybe, the "delight them" focus should be on the customers' experience rather than delight in the company's product.
What are your thoughts on customer service and the customer experience? Do you want to be delighted or are you often disappointed because your core services needs are unmet?
Customers notoriously punish bad service, particularly when left on hold, transferred while on a service call, or when they receive egregious customer service. We've all been there and we've all told two friends and two friends and so on. Customers share negative experiences at twice the rate of positive ones, according to a customer service and loyalty study from the Customer Contact Council. What's more, reducing customers' efforts builds loyalty, the vague notion of delighting them does not. Customers want their problem solved, not some whiz-bang, over the top widget. And finally, improved customer service, particularly in the call-to-contact realm does more to reduce churn than the "delight them" efforts. Maybe, the "delight them" focus should be on the customers' experience rather than delight in the company's product.
What are your thoughts on customer service and the customer experience? Do you want to be delighted or are you often disappointed because your core services needs are unmet?
Tuesday, July 27, 2010
Calculating Tenure when your Cancel Date is in the future
A configuration on our EComm system requires that every subscriber have a NOT NULL Cancel Date. This presented some difficulty for calculating tenure, particularly since the default for paying subscribers is 2999-01-01. Wow! That's some great tenure. They've been paying subscribers longer than we've been in business.
For the past few weeks I've been grappling with how to adjust the CancelDt and calculate tenure. Since I'm using Tableau on top of Teradata, I've dug into both and decided on the below.
These calculations are included in a monthly churn workbook I produce for our product team. The workbook is updated at the beginning of each month and analyzes churn and media adoption for the previous month, it's preceding three month average, and the same periods in the previous year. The dates are configured to update dynamically, so the current_date assignment is valid for the entire month the workbook is valid. Once the workbook updates for the following month, you'll be ahead and the current_date assignment will remain in the month ahead of the study month.
Previous method for adjusting the CancelDt and calculating churn:
In Teradata create a view or use custom query for Tableau
Then in Tableau, create a custom metrics to resolve the future date
"CancelMo Truncated"
IIF([CancelMo] > TODAY(), DATETRUNC('month',TODAY()),[CancelMo])
and to calculate "Tenure"
DATEDIFF('month',[SignupMo],[CancelMo Truncated])
Updated Method
Push it all back to the database:
select RptMonth
, Id
, SignupDt
, CASE WHEN CancelDt = '2999-01-01' THEN current_date ELSE CancelDt END CancelDtUpd
, (CancelDtUpd - SignupDt) month(4) as Tenure
, CASE WHEN CancelDt = '2999-01-01' THEN 1 ELSE 0 END PayingFlag
from foo
For the past few weeks I've been grappling with how to adjust the CancelDt and calculate tenure. Since I'm using Tableau on top of Teradata, I've dug into both and decided on the below.
These calculations are included in a monthly churn workbook I produce for our product team. The workbook is updated at the beginning of each month and analyzes churn and media adoption for the previous month, it's preceding three month average, and the same periods in the previous year. The dates are configured to update dynamically, so the current_date assignment is valid for the entire month the workbook is valid. Once the workbook updates for the following month, you'll be ahead and the current_date assignment will remain in the month ahead of the study month.
Previous method for adjusting the CancelDt and calculating churn:
In Teradata create a view or use custom query for Tableau
select RptMonth , Id , SignupDt - (EXTRACT(DAY from SignupDt) -1) as SignupMo , CancelDt - (EXTRACT(DAY from CancelDt) -1) as CancelMo from foo
Then in Tableau, create a custom metrics to resolve the future date
"CancelMo Truncated"
IIF([CancelMo] > TODAY(), DATETRUNC('month',TODAY()),[CancelMo])
and to calculate "Tenure"
DATEDIFF('month',[SignupMo],[CancelMo Truncated])
Updated Method
Push it all back to the database:
select RptMonth
, Id
, SignupDt
, CASE WHEN CancelDt = '2999-01-01' THEN current_date ELSE CancelDt END CancelDtUpd
, (CancelDtUpd - SignupDt) month(4) as Tenure
, CASE WHEN CancelDt = '2999-01-01' THEN 1 ELSE 0 END PayingFlag
from foo
Subscribe to:
Posts (Atom)