Yesteryear

Tuesday, August 19, 2008

August 19, 2008

           Do you know who this is? Who wouldn’t love a face like that? The very picture of loyalty. Fooled you, this is not Millie-Belle. This picture was taken in Colorado approximately four months ago. We don’t have green lawns like that in Florida, at least not much. This doggie is none other than Benjamin, Marion’s dog she rescued from the animal shelter.
           Since there was not much action today, here is a collection of facts that pertain to this year, 2008. Which country has the highest crime rate? New Zealand. (The US ranks 8th.) The highest murder rate? Colombia. (US ranks 26th.) The least corrupt? Iceland. (Canada refused to participate in the survey but is estimated at between 6th and 9th; the US came in at 30th.) [Author’s note: the charging of extra fees to “expedite” applications, whether by an individual or a department, is not considered corruption in Canada, and in fact these fees are often blatantly posted on the wall. In the US, only the passport office does this to the same degree.]
           Today, in the USA, there are 81,420,000 cats. Of all those, I got Pudding-Tat. In Singapore, 100% of the population lives in a city. Germany lost 116,000 aircraft in WWII. Harry Potter, the series, has sold 375,000,000 books, outranking the Bible in all of history. As the media continues to hone its definitions, I’m sure we’ll see plenty more of this Potter type of record. Musically, the Beatles’ best single, “Hey Jude,” sold just 4,000,000. Mind you, the Beatles still hold the record for top albums, at 19. Second prize goes to Elvis at a mere 10. As pointed out elsewhere, Clapton and Hendrix don’t even rate in the top 200, being mere flashes in the pan.
           The German airplane figure is suspect. I didn’t know they had produced so many, and I know they were not all shot down. That would have produced the unbelievable total of some 20,000 Allied aces. The number must include aircraft lost by means other than air combat. The Maldives, an island nation in the Indian Ocean, averages 7.8 feet above sea level. That beats Key West. Was it the Maldives that thought they were sinking until somebody concluded it was the ocean rising?
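           Back to the airplanes for a moment. The ace estimate is simple division, and it can be checked in a few lines. This is only a rough sketch: the five-victory threshold is the usual convention for an "ace" and is my assumption here, as is the idea that every loss was an air-to-air kill credited to a pilot who stopped at exactly five.

```python
# Rough check of the "some 20,000 Allied aces" estimate.
# Assumptions (not from the book of facts): an ace needs 5 victories,
# and every lost aircraft was shot down by a pilot with exactly 5 kills.

AIRCRAFT_LOST = 116_000   # the figure quoted for Germany in WWII
KILLS_PER_ACE = 5         # conventional threshold, assumed


def implied_aces(aircraft_lost: int, kills_per_ace: int = KILLS_PER_ACE) -> int:
    """Upper bound on the number of aces if all losses were air-combat kills."""
    return aircraft_lost // kills_per_ace


if __name__ == "__main__":
    print(implied_aces(AIRCRAFT_LOST))
```

           The result lands a little over 20,000, which is why the figure only makes sense if it includes losses on the ground, accidents and the like.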
           I was taking a closer look at data mining when I came across a book of facts about the year 2008. Interesting, because that year has not yet finished occurring. It covered stuff like the above, which is not likely to change much. From time to time, I play a round of Ridiculist, a computer chat game I’ve previously mentioned. I had to smile at the number of questions that used the very book I was holding as a reference.
           Allow me to clear up a misconception about data mining. It is not anything new; the only difference is that the process has become computerized. Data mining is nothing more than going over masses of information looking for useful patterns, or in some cases just any patterns. It used to be done by hand. For example, when I authored my (not yet famous) annual report of the most successful families at the phone company, it was largely based on a count of surnames on the roster.
           By matching that tally to the published union pay rates, it was possible to calculate which families had the highest payroll. For the curious, the record was a family with 472 members with a combined income of over $1,000,000 per month. I can’t recall the name but it was something like “LeBlanc”. And that doesn’t even include related women whose surnames had changed through marriage.
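           For anyone who wants to try the same thing without a pencil, here is a rough sketch of the surname tally matched against pay rates. The roster, job titles and pay figures are all made up for illustration; the real report used the company roster and the published union rates, which I am not reproducing here.

```python
from collections import Counter

# Hypothetical roster of (surname, job title) pairs; not real data.
roster = [
    ("LeBlanc", "lineman"),
    ("LeBlanc", "operator"),
    ("Smith", "operator"),
    ("LeBlanc", "clerk"),
    ("Smith", "lineman"),
    ("Jones", "clerk"),
]

# Hypothetical monthly pay per job title, standing in for union pay rates.
monthly_pay = {"lineman": 2400, "operator": 1800, "clerk": 1600}


def family_stats(roster, pay):
    """Return (head count per surname, combined monthly payroll per surname)."""
    head_count = Counter(surname for surname, _ in roster)
    payroll = Counter()
    for surname, job in roster:
        payroll[surname] += pay[job]
    return head_count, payroll


head_count, payroll = family_stats(roster, monthly_pay)

if __name__ == "__main__":
    # Families listed from highest combined payroll to lowest.
    for surname, total in payroll.most_common():
        print(surname, head_count[surname], total)
```

           As with the hand count, this only groups by surname, so relatives whose names changed through marriage are missed.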
           Databases are a data mining dream, as they purportedly contain columns of unique (normalized) information. It is child’s play to develop queries that amalgamate these columns into patterns. The technique will always suffer from operator errors. Although I have not found any examples, I am certain that advanced data mining is heavily dependent on statistical theories. And that is why I think you should use a fake name on your grocery card. I was unable to find any upper level information on the state of data mining as it currently exists. It is another one of those “computer” operations where they don’t want any outsiders knowing exactly what is going on.
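           To give an idea of how little work such a query takes, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table names, column names and sample rows are hypothetical, invented only to show two columns being amalgamated into a pattern; nothing here describes how any real company does it.

```python
import sqlite3


def top_families(employee_rows, rate_rows):
    """Join a roster to pay rates and total the payroll per surname."""
    conn = sqlite3.connect(":memory:")  # throwaway database, nothing on disk
    conn.execute("CREATE TABLE employees (surname TEXT, job TEXT)")
    conn.execute(
        "CREATE TABLE pay_rates (job TEXT PRIMARY KEY, monthly_pay INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?)", employee_rows)
    conn.executemany("INSERT INTO pay_rates VALUES (?, ?)", rate_rows)
    result = conn.execute(
        """
        SELECT e.surname, COUNT(*) AS members, SUM(p.monthly_pay) AS payroll
        FROM employees AS e
        JOIN pay_rates AS p ON p.job = e.job
        GROUP BY e.surname
        ORDER BY payroll DESC
        """
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data, for illustration only.
sample_employees = [
    ("LeBlanc", "lineman"),
    ("LeBlanc", "operator"),
    ("Smith", "operator"),
    ("LeBlanc", "clerk"),
    ("Smith", "lineman"),
    ("Jones", "clerk"),
]
sample_rates = [("lineman", 2400), ("operator", 1800), ("clerk", 1600)]

if __name__ == "__main__":
    for row in top_families(sample_employees, sample_rates):
        print(row)
```

           The whole "mining" step is the one SELECT statement; everything else is just loading the tables.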