Tuesday, November 12, 2013

Learning Statistics

I've been getting some feedback on my previous post in which people complain about how much they hated their Statistics class, and how they didn't feel like they learned much that was useful.

Perhaps my fresh feeling about statistics is due to my never having taken a formal statistics class.  I pretty much had to teach myself everything I know.  I thought I would take this opportunity to highlight some of the excellent sources I used to learn statistics.

My first raw exposure was as a post-doc with Andy Gould at Ohio State University.  Andy is probably the smartest person I've ever worked with.  I'm probably going to mess up his potted biography a bit, since it's from memory, but as I recall him saying, Andy had dropped out of college in the '60s to work on an assembly line, I think at Ford, hoping to radicalize the proletarian masses.  But in his spare time, he dabbled in astrophysics, leading him to write a letter to Stephen Hawking pointing out a mistake in a paper on black holes.  Hawking suggested he go to graduate school.

In order to go to graduate school, Andy had to finish his undergraduate physics degree.  While going through the usual physics labs we've all taken, he independently invented modern Bayesian statistics, developing his own notation and terminology in the process.  Andy's work, as I remember it, is centered on Cramér–Rao bounds and Fisher information, though he had his own names and notation.  He summarized some of his techniques in this arXiv preprint.  I remember this paper on the optimal design of microlensing experiments as an example of his techniques in application.

I gradually became complacent in my understanding of estimation theory, until I had a big shock when I switched to Medical Physics.  There, my bible was the tome Foundations of Image Science by Barrett and Myers.  I think that this book is of great value for all scientists, not just those working in image theory.  After all, an image is just a collection of data with a particular organization.  These authors brought together a great deal of the general science of data analysis.  The first six chapters are basically an advanced undergraduate degree in Mathematics, chapters 11 and 12 study the statistics of general detectors and photon-noise-limited detectors, while chapter 13 is a study of general statistical analysis.

In particular, these authors divided statistical tasks into Estimation tasks and Classification tasks.  In a nutshell, an estimation task tries to put a number, or multiple numbers, on a set of data, while Classification tries to interpret data as leading to one of a finite number of options.  As a physical scientist, my background was purely in estimation tasks, and the goal of a statistical understanding of an estimation task is to place a properly-sized error bar on a graph, while the appropriate tool is the Cramér–Rao bound I had learned from Andy.
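As a rough illustration of what the Cramér–Rao bound says about an error bar, here is a minimal sketch for the simplest case I can think of: estimating the mean of Gaussian data with known scatter. The function names and the numbers (sigma = 2, 25 samples) are made up for the example and do not come from Andy's papers or from Barrett and Myers. For this case the Fisher information is N/sigma^2, so no unbiased estimator can have a variance below sigma^2/N, and the plain sample mean reaches that limit.

```python
import random
import statistics


def cramer_rao_bound_gaussian_mean(sigma, n):
    """Smallest possible variance of an unbiased estimate of a Gaussian mean."""
    fisher_information = n / sigma ** 2   # information carried by n independent samples
    return 1.0 / fisher_information


def empirical_variance_of_sample_mean(mu, sigma, n, trials, seed=0):
    """Repeat the experiment many times and measure the scatter of the sample mean."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        data = [rng.gauss(mu, sigma) for _ in range(n)]
        estimates.append(statistics.fmean(data))
    return statistics.pvariance(estimates)


# Hypothetical experiment: 25 measurements, each with a scatter of 2 units.
bound = cramer_rao_bound_gaussian_mean(sigma=2.0, n=25)
observed = empirical_variance_of_sample_mean(mu=10.0, sigma=2.0, n=25, trials=5000)

print(f"Cramér–Rao bound on the variance: {bound:.3f}")
print(f"Observed variance of sample mean: {observed:.3f}")
print(f"Smallest honest error bar:        {bound ** 0.5:.3f}")
```

The square root of the bound is the error bar to draw on the graph: anything smaller would be claiming more than the data can support.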

But, in reality, we are often performing classification tasks, and the true underlying purpose of an estimation task is to distinguish between two distinct underlying theories.  Astronomers love to perform classification tasks with their Type I and Type II supernovae, their Population I and Population II stars, their Elliptical, Spiral, and Irregular galaxies, etc.  The workhorse of the classification task is the ROC curve.
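For readers who have not met the ROC curve, a small sketch may help. It assumes a classifier that hands back a score for each object, with higher scores meaning "more likely to be in the positive class", and it sweeps the decision threshold from strict to lenient, recording the true-positive and false-positive rates at each step. The scores and labels below are invented for illustration.

```python
def roc_curve(scores, labels):
    """Return (false-positive rates, true-positive rates), one point per threshold.

    labels are 1 for the positive class and 0 for the negative class;
    a higher score means the classifier leans more toward positive.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    fpr, tpr = [0.0], [0.0]   # strictest threshold: nothing is called positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold past every object that shares this score.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoid-rule area under the ROC curve (1.0 is perfect, 0.5 is a coin flip)."""
    area = 0.0
    for k in range(1, len(fpr)):
        area += (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
    return area


# Made-up scores for six objects, three of which truly belong to the positive class.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

fpr, tpr = roc_curve(scores, labels)
auc = area_under_curve(fpr, tpr)
print(list(zip(fpr, tpr)))
print(f"area under the curve: {auc:.3f}")
```

Each point on the curve is one possible trade-off between catching real positives and raising false alarms, which is why the curve, rather than a single error bar, is the natural summary of a classification task.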

While working for Human Network Labs, I ran into a big cultural difference in statistical analysis between scientists and engineers.  As scientists, we go through distinct phases: we gather data, then we interpret data.  The interpretation is done retrospectively on a complete data set.  But real life is not like that.  We are constantly receiving new information, and have to constantly revise our assessments based on what we've learned.  Roboticists especially have to deal with data in this manner.  Here, there is an emphasis on systems which can be easily updated with the introduction of new data without having to start again from scratch.  I think you can learn a lot about Bayesian statistics by understanding the Particle Filter.
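To make the "update as the data arrive" idea concrete, here is a minimal sketch of a bootstrap particle filter in one dimension. Everything specific in it is an assumption chosen for the example: a hidden position that drifts as a random walk, Gaussian measurement noise, 1000 particles, and the parameter values. It is not the system we built at Human Network Labs. Each particle is one hypothesis about the hidden position; a new measurement reweights the hypotheses by their likelihood and resamples them, which is Bayes' rule carried out by sampling.

```python
import math
import random


def particle_filter_step(particles, measurement, process_sigma, measurement_sigma, rng):
    """Fold one new measurement into the current cloud of particles."""
    # Predict: let each hypothesis drift by a random-walk step.
    predicted = [p + rng.gauss(0.0, process_sigma) for p in particles]
    # Weight: Gaussian likelihood of the measurement under each hypothesis.
    weights = [
        math.exp(-0.5 * ((measurement - p) / measurement_sigma) ** 2)
        for p in predicted
    ]
    if sum(weights) == 0.0:
        # Every particle is far from the measurement; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw a new cloud in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(42)
true_position = 5.0   # hypothetical hidden state, held fixed for the demo
particles = [rng.uniform(0.0, 10.0) for _ in range(1000)]   # flat prior on [0, 10]

for _ in range(30):
    measurement = true_position + rng.gauss(0.0, 0.5)   # a new noisy reading arrives
    particles = particle_filter_step(particles, measurement, 0.1, 0.5, rng)

estimate = sum(particles) / len(particles)
spread = (sum((p - estimate) ** 2 for p in particles) / len(particles)) ** 0.5
print(f"estimate = {estimate:.2f} +/- {spread:.2f}")
```

Note that the loop never looks back at old measurements. The particle cloud is the running summary of everything seen so far, so each new reading costs the same amount of work as the first.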

My latest career change into internet advertising forced a new shift into Machine Learning techniques.  As I see it, in physics, there is a theory, or multiple theories, and the goal is to distinguish between multiple theories (classification) or refine the parameters of a theory (estimation).  But when dealing with humans and other living things, there is no theory; humans are complex and we cannot deduce how people will act, even statistically, from first principles like we can for stars and crystals.  We can only describe how people behave based on many different noisy data sources.  "Learning" here can be reduced to drawing a "smooth" model through noisy data in a very high dimensional space.  The best source I have found here is The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman, best in part because it is freely available as a .pdf download.

Thursday, November 7, 2013

Why we need a Common Core

 The ethically-informed, statistically-literate policy maker for our modern multicultural technological society.
Thinking about the ostensible reasons why people should study humanities instead of science.
I saw this quote from the Statistician to the Stars:

Science and math give us terrific toasters, efficient ways of annoying strangers with our electronic toys, and are darn good fields at extracting money from Leviathan. But none of them say word one about what is the best in life, which is the ideal way to live, what life is about, why life even exists, why anything exists, what is good and what evil, what is right and what wrong.

But in my work as a medical physicist, I find that I often have to make tradeoffs that the Humanities people should be making: how can I weigh extra pain and discomfort to a patient vs the cost of some improvement, or the improved accuracy of some diagnosis?  How many unnecessary mastectomies should we perform to save one woman's life from breast cancer? How should I trade off risk to the general public from radiation exposure vs. benefits to the patient of some new procedure vs the cost (in dollars) of shielding?  How do I weigh having more false alarms and potentially scaring patients vs the potential for improved patient care?

These seem like the kind of questions that a humanities person should be helping me with!  But I barely even get help from the physicians who work on the project, let alone our mythical on-call bioethicist or even my poet, philosopher, or artist friends.  People just don't understand the details of the trade-offs -- the math is too hard for them, because they decided they aren't "math people".  So the million small decisions and tradeoffs and some of the large ones end up being made by the engineering team.

To me, this argues for the kind of multidisciplinary education that I had, but with tweaks.  Humanities people, studying what is right and wrong presumably so that they can best set policy, must must must have a deep knowledge of statistics.  At my college, they were required to study calculus, but calculus is just a tool to solve some problems, and other schools are happy if they can be taught again how to subtract fractions (only to immediately forget it again).  Statistics is applied epistemology -- it teaches us how to distinguish between what is (likely to be) true and what is (likely to be) false.  Of course, you can't really do statistics without calculus.  If people want to guide society by helping us with these life-or-death questions, they need to have the tools to understand what the engineers who are building society are doing.

Meanwhile, the science geeks should be taught humanities, but with ethics as the focus.  Not the day-to-day ethics of whether I should accept this gift from a lobbyist, but the overarching ethics of what is good in life, and what we should do with our limited time on this dizzy planet, and what is good for the millions of people who will use the technology we produce or maintain.  Multicultural studies are important because we will have to make decisions that affect people different from us, and we have to understand them, or at a minimum, understand that they may be different from us.  We need to study Plato and Aristotle and the Buddha and Mohammed (PBUH) because without them we can't understand Bentham and Quine, and the latest modern theories of how we can best serve our patients, customers, clients, co-workers, friends and family.

Modern capitalism has an answer to this for the engineers, one they are adept at calculating: do what maximizes the long-term risk-adjusted inflation-adjusted tax-adjusted time-adjusted expected profit, measured in dollars, and to hell with anything else.  In practice, I get more feedback from the marketers and investors about the technical decisions I make than from the physicians and patients.

But at present, until the poets buckle down and learn their statistics, they are stepping away from their responsibilities in a modern technological society.

Friday, August 30, 2013

Equation tester

I've just added the MathJax equation displayer to my blog. This will allow me to put some real science up here with the LaTeX I love. I followed the steps outlined here.

Pythagoras' theorem is $a^2 + b^2 = c^2$. The definition of a limit is $$\lim_{x \rightarrow x_0} f(x) = f_0 \equiv \forall \epsilon > 0: \exists \delta > 0: 0 < |x - x_0| < \delta \implies |f(x) - f_0| < \epsilon$$

Monday, November 7, 2011

why doesn't it feel like progress?

John Tamny argues, correctly, that manufacturing job loss is a sign of increasing productivity. So why does it feel like so much pain? What is it about the manufacturing jobs that we actually miss?

America is undoubtedly a country which is much richer now than it was decades ago during the manufacturing heyday.

Much of that growth is due to population growth and the increasing participation of women in the workforce, but the US has been, and remains, the wealthiest large country in the world measured by per capita GNP, with a mean wealth that has been growing pretty much since the founding of the republic.

So why do we feel like crap?

Although the country has grown wealthier on average, that doesn't mean most people have grown wealthier. If the former CEO of a failed company who still managed to get a 10 million dollar golden parachute walks into a bar, the mean wealth of the patrons of the bar goes way up. But the typical patron is still no richer. A better statistic is the median wealth.
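The bar example is easy to check with a few lines of arithmetic. The patrons' wealth figures below are invented for illustration; only the 10 million dollar parachute comes from the example above.

```python
import statistics

# Hypothetical net worth, in dollars, of nine patrons already in the bar.
patrons = [20_000, 35_000, 40_000, 50_000, 60_000, 75_000, 90_000, 110_000, 150_000]

mean_before = statistics.mean(patrons)
median_before = statistics.median(patrons)

# The former CEO walks in with his 10 million dollar golden parachute.
with_ceo = patrons + [10_000_000]

mean_after = statistics.mean(with_ceo)
median_after = statistics.median(with_ceo)

print(f"mean:   {mean_before:>12,.0f} -> {mean_after:>12,.0f}")
print(f"median: {median_before:>12,.0f} -> {median_after:>12,.0f}")
```

With these made-up numbers the mean jumps from 70,000 to 1,063,000 dollars, while the median only moves from 60,000 to 67,500, and that small shift comes purely from there being one more person in the room. Nobody who was already at the bar got any richer.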

Shockingly, median household income has not increased since the 1990s, and has only barely increased since 1967! The typical American family is not much richer now than in the manufacturing heyday of 1967!

(source US census, graph courtesy of NPR)

During this time, since 1967, mean income has nearly doubled! So there is much more wealth per person now than there was in 1967, but the typical household is barely better off. (Part of this is due to the decline in the number of people per household - a family of four needs more money than a single guy living alone.)

By now, it's no surprise where all the extra money has gone: to the top 1% and really the top 0.1%.

The problem is that the increase in manufacturing productivity and its concomitant loss of jobs has coincided with a rise in the share of the income captured by the few at the top. The line workers at GM who were pulling down $50/hour before they were laid off are now making $10/hour at Walmart while the Walmart CEO is making $35 million a year (more in an hour than the typical Walmart worker makes in a year).

The fact is that manufacturing jobs are seen as good jobs: well paid, with good benefits. Their loss has been seen as a loss of good jobs. But this is perhaps more a difference in culture: once upon a time manufacturing jobs were Satanic Mills while Andrew Carnegie made millions. But today there is simply a tradition that the manufacturing workers will share in the benefits.

Coming soon: how has this affected the economy for scientists?

Thursday, March 25, 2010

Health Care Rant

My thoughts on health care are summarized by this graph. Here I show life expectancy as a function of how much money is spent on health care. No surprise: there is a strong correlation, and countries that spend more money on health care are generally healthier than countries that spend less.

There are a few exceptions though. These are the countries at the top of this distribution. The citizens of Vietnam, Cuba, Costa Rica, and Japan enjoy an extra five years of life compared with other countries that spend a similar amount. Five years of life expectancy is huge: Japan's life expectancy was five years lower back in 1985. Five years of life expectancy thus represents maybe 25 years of medical advances! According to the trend, to gain five years of life expectancy requires almost quadrupling health care expenditures! It is the difference between a wealthy European country like Germany and a middle-income one like Poland, or the difference between Poland and a poor country like Egypt.

Then there are the countries that seem to have particularly dysfunctional health-care systems. I show them here. These are the places along the bottom of the distribution. These are places wracked by war or AIDS, dysfunctional societies, places that are so corrupt that the money spent on their system is squandered so that they have the life expectancy of countries much poorer than they. The countries on this list are legends for waste, corruption, and inefficiency: Sudan, Cambodia, Russia, and the US.

Yes: our current health-care system is as wasteful and corrupt as Russia's. Far from being the best in the world, it is on par with Costa Rica's even though they spend (wait for it...) an eighth of what we spend on health care. Our life expectancy is only marginally better than Mexico's even though Mexico is famous for its unhealthy food and high smoking rate.

The US spends much more than other wealthy countries do on healthcare, and well we should: America's GNP per capita is substantially higher than Japan's, Germany's, France's or Britain's. But what is shocking is just how much of that money is wasted, how little we get for it. Not only do these countries spend half of what we do, but they are objectively healthier than we are. Given that the basic function of our health care system is to keep us all alive, it's as if, under the current system, two thirds of the money we spent was siphoned off by corrupt bureaucrats.

I worked in healthcare for several years, and I was shocked by the lavish waste I saw every day, the huge sums of money I saw spent on equipment we didn't use, the lavish salaries of the specialists (including me, though not as lavish as some), the marble plazas, the ridiculous paperwork (my primary care doctor employs more billers than nurses). Fortunately, we are such a wealthy country that we have been able to afford such massive waste without it breaking our budget. But no longer. As baby boomers retire and general healthcare inflation keeps going, Medicare will in a few decades crowd out all other federal spending.

But I have a personal stake in the bill (besides wishing that I could have the lifespan of a smoking, alcoholic Frenchman). I (along with half the physics department at Reading Hospital) was laid off this year due to the economic crisis (the president of the Hospital had been investing the Hospital's money in Collateralized Debt Obligations). Even though I'm basically a healthy person who doesn't go to the doctor for more than an annual checkup, I cannot get health insurance because of some obscure pre-existing condition that I never even knew would be an issue. Once my COBRA runs out, I'm fucked. If I get cancer now, they will have to pull the plug on my treatment in a few months. Or else, the doctor who treats me will just have to work for free. We were planning on having a baby, but had to put that off because pregnancy is a "pre-existing condition". If we had already been pregnant, we would have had no choice but to have an abortion because we couldn't afford the obstetrician's fee.

I have a good job now working for an entrepreneurial high-tech startup. But we are mostly paid with stock options, not much money or health insurance. That's the American way, and when our product hits the shelves and we go public, we'll all be rich. But our company is too small to be able to afford health insurance. If HCR had failed, I would have had to quit and try to get a job at some other place that I hated, doing much less valuable work. In fact, our tiny company has opened up offices in Singapore and Taiwan in part because the engineers, programmers and scientists who work there can get health insurance for so much less than it costs here in the US.

To Jeremy and Gian-Carlo: We all had the same advantages: excellent high school education, top colleges and graduate schools, good high-paying jobs with lavish benefits. These benefits gave us the kind of security (at large and growing cost to our employers) that shielded us from the fundamental horror that afflicts the millions of uninsured. But, until 2014 when most of the healthcare reform kicks in, you guys are just one lay-off from being tossed into the ranks of the uninsured and never being able to get insurance for yourselves or your families.

Sunday, March 14, 2010

Cool video showing sky rotation (h/t Andrew Sullivan)

Friday, February 13, 2009

Astronomy public talks

Here are the slide shows for public talks that I've given in astronomy. Talks that I've given in Nuclear Medicine can be found at my other blog: graffnucmed.blogspot.com.

Of these, I think that the one on Enlightenment astronomy has always been my favorite.  I've always loved the interplay of science and philosophy and religion, and this talk brings it out.  The talk on Gravitational Microlensing was given as a job talk and highlights my research, but also that of my many colleagues in this subfield of Astronomy.

Thursday, November 20, 2008

Rollergirl

J'ai dix-huit ans and I am a freestyle slalom skater.

Monday, November 17, 2008

First images of planets

From today's APOD, astronomers have finally imaged planets around other stars. See also this image from September showing the first image of a planet around another star.

Why is this important?

So far, hundreds of planets have been found orbiting other stars. Typically, they are found through detecting the gravitational pull they exert on their star, causing a small wobble in the motion of the star. But all that can be inferred from these discoveries is the existence of the planet, and a rough estimate of its mass.

Capturing light from another planet, and especially capturing a spectrum of that light, allows us to probe the chemical composition of its atmosphere.  Changes in the brightness or color of the planet over time allow us to infer its rotation rate. This will be how life will be discovered on other planets.  The discovery article (pdf) shows a nice spectrum of this planet.

These planets were discovered now because they are particularly easy to find.  All the planets are very large (several Jupiter masses) and quite far from their host star.  The innermost planet (labelled d in the image) is at about Neptune's distance from its star.  These planets are thus quite different from the planets of our solar system, and we will need a new theory of planet formation to explain how they got out there.  We're still a long way off from seeing tiny Earth-mass planets orbiting close in to their sun.

See movies here at the telescope home page to see how the image of the star was removed and then images taken over several years were added together to reduce noise.

Thursday, November 13, 2008

MisEducation

Dear President-Elect Obama,

I and millions of Americans supported, volunteered, voted, and cheered for you over the past several months. Now we are asking you to do the difficult job of holding to your promises.

A crucial promise to the long-term success and sustainability of our country is the promise of education. Rich, engaging, globally competitive education. Or, at the very least, a little reading, writing, and 'rithmetic.

So let’s break for a quiz: Quick, what’s the source of America’s greatness?

Is it a tradition of market-friendly capitalism? The diligence of its people? The cornucopia of natural resources? Great presidents?

No, a fair amount of evidence suggests that the crucial factor is our school system — which, for most of our history, was the best in the world but has foundered over the last few decades. The message for Mr. Obama is that improving schools must be on the front burner.

With respect and hope.

Wednesday, November 12, 2008

Ensemble planeta

I sang this in graduate school.  I guess I'm a sucker for a cappella.

Sunday, November 9, 2008

Reports from a playboy

Peter's new article on Iraq is really funny, albeit reporting on the death of the oldest diaspora Jewish community in the world. Whistling past the graveyard indeed.
BAGHDAD (Reuters) - One of the last eight Jews in Baghdad, a portly retired accountant, erupts in a bellyful of laughter when asked why he never married.

"I was a playboy. Don't write that!" he jokes, grinning. "How old do you think I am? Wrong. I'm 65! Don't write that! Write that I am 55!"
...
How many Jews are there now?

"We know them all," says the ex-accountant, counting.

There's the ex-accountant himself, plus the nephew with whom he shares a rented house in Baghdad's central Karrada district. There's the man who lives near them, the man who leads the community, the very old woman, the male doctor and the female dentist. And the man whose brother was a goldsmith.

The goldsmith married the dentist a few years ago. A few months later, he was abducted by gunmen.

Changing

For a while,  I forgot that anything was going on in the world outside of my immediate, day to day activities and the election. But now I remember.

Can we, on a national level, use the focus and diligence of the volunteers who brought Change to bring environmental change? Can these troops (and more) continue to be rallied? Al Gore speaks of national change and personal change.

Saturday, November 8, 2008

Better than his base

The Economist's view of McCain? A fundamentally decent man doomed by his pact with the non-reality-based community.

Friday, November 7, 2008

Emotional Overflow

I don't have the brain space to process all the details of this election, but a friend passed on Judith Warner's NY Times blog which sums up my emotional response (and much of the country's) in an articulate and moving way.

(photo from the article added by David)

Peter's thoughts on Iraq

My brother, currently stationed in Baghdad with Reuters, has his own take on Iraq's new willingness to negotiate with an Obama administration. His article seems to imply that the Iraqis were deliberately stalling until after the election.

Obama's support among gays

As noted before, Obama saw a big drop in support among gays. Why?

In 2004, Gay marriage was a huge issue in several states, and is thought to have pushed Bush over the edge in Ohio. But this is a one-time issue: there are only so many times you can pass a constitutional referendum on the same subject. This issue was huge in swing states and may have pushed Gay support for Kerry to higher levels.

But in 2008, there were still some similar amendments, especially in populous Florida and California. Obama's support for gay marriage in California was notably lukewarm: he basically said that he thought that marriage was between a man and a woman but that the constitution should not be amended to support his belief.

In the primary campaign, Obama gave a big speech to a black congregation in Atlanta where he specifically tied rights for gays and lesbians to the struggle for racial justice, saying:
And yet, if we are honest with ourselves, we must admit that none of our hands are entirely clean. If we're honest with ourselves, we'll acknowledge that our own community has not always been true to King's vision of a beloved community.

We have scorned our gay brothers and sisters instead of embracing them. The scourge of anti-Semitism has, at times, revealed itself in our community. For too long, some of us have seen immigrants as competitors for jobs instead of companions in the fight for opportunity.

But he made no such effort in the national campaign. Had he pushed harder for black support for marriage equality in California, it might not have foundered.

Statistics on the Election

Kevin Drum posts some interesting statistics of which groups came out for Obama relative to the national average and which did not. Unsurprisingly, Obama did well among the young, but surprisingly only tied Kerry amongst gays and lesbians, and did especially well amongst high earners (the primary victims of his tax plan).

Democrats have been wondering "what's the matter with Kansas?" Now Republicans will have to wonder "what's the matter with Connecticut?" Why is it that so many rich people come out for Obama despite their narrow short-term economic self-interest?

I think my Dad fits the bill of one of those high earners who were converted to Obama. He was turned off by the social conservatism represented by Sarah Palin and Michele Bachmann. But he also saw the value of his retirement savings eroded by inflation and the low dollar caused (he says) by Bush's budget deficits. Meanwhile, Bush did not so much cut taxes on high earners as on the wealthy through his massive cuts of inheritance and capital gains taxes. Bush showed his natural affinity for the ne'er-do-well sons of wealthy fathers like himself. Meanwhile, many people with high incomes who live on the coasts don't actually have much wealth: they spent their money on inflated real estate and watched the value of their expensive houses plummet under Bush.

But over on the Corner, there is a worrisome critique of Obama's constituency. Despite all his emphasis on the middle class, it is not totally unfair to characterize much of his support and his supporters as coming from a mix of poor blacks and rich whites. He himself may not fit into either category, but plenty of my fellow Obama volunteers did. My previous sentiments of unity with my black and Hispanic co-volunteers should be colored by the yawning gaps in our education levels and prospects.

The response is to note that my current home town, solidly and genuinely middle class West Reading, came out overwhelmingly for Obama (pdf) while the more rural precincts of Berks County, with comparable income and education levels, went for McCain. Values, represented in part by our choice to live in our cohesive and walkable town, seemed to trump income.