By admin , 26 February 2011

Last week I finished a very interesting book, Data-Intensive Text Processing with MapReduce. For those of you interested in such matters, I can recommend this short paper by researchers at Google: "The Unreasonable Effectiveness of Data" (PDF). It makes the case that simple algorithms and models that scale well will outperform sophisticated algorithms and models that scale less well, given enough data.

This is particularly important in the field of human language processing, where two developments are intersecting. First, there is the availability of vast corpora of text harvested from the internet. Second, programming models such as MapReduce can now provide near-linear scaling of computational power. That means that if you double the number of computers available to a job, it can run at almost exactly twice the speed. That provides the scalability needed to deal with these huge data sets.

This is in contrast to older approaches in the field, where researchers tried to model language with hand-coded grammars and ontologies, represented as complex networks of relations. As the article points out, this dichotomy is an oversimplification, and in practice researchers combine "deep" approaches with statistical approaches.

From the article:

"So, follow the data. Choose a representation that can use unsupervised learning on unlabeled data, which is so much more plentiful than labeled data. Represent all the data with a nonparametric model rather than trying to summarize it with a parametric model, because with very large data sources, the data holds a lot of detail. For natural language applications, trust that human language has already evolved words for the important concepts. See how far you can go by tying together the words that are already there, rather than by inventing new concepts with clusters of words. Now go out and gather some data, and see what it can do."

Cool stuff, and fun to read about.

By admin , 25 February 2011

It's beautiful to see a real change in paradigm happening. I remember in college how much I enjoyed programming in functional languages, and how cool it is to be able to look at problems from a different viewpoint. What Google and others have achieved with MapReduce is a similar change in the way of looking at problems.

MapReduce is the name of Google's programming model for processing huge data sets. Other companies have since followed suit. I didn't know much about this field, and this book is a great introduction. It provides a good description of the foundations, and I love that it describes practical uses. Examples include machine translation, Google's PageRank, and shortest paths in a graph.

Actually in use

What I like about MapReduce is that it provides an abstraction for distributed computing that is actually being used and is successful. The book showed the scaling characteristics of an example algorithm (the "stripes" approach for computing word co-occurrence) on Hadoop: an R^2 of 0.997 for a linear fit! That means the speedup is almost perfectly linear as you add extra machines.
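
To give a feel for the word co-occurrence approach mentioned above (the book calls it "stripes"), here is a minimal sketch in plain Python. It is not the book's Hadoop code: the function names, the `window` parameter and the in-memory `shuffle` step are my own assumptions, and everything runs in a single process rather than on a cluster. The idea is that the mapper emits, for each word, a small table of neighbor counts, and the reducer adds those tables together element by element.

```python
from collections import Counter, defaultdict


def map_stripes(sentence, window=2):
    """Mapper (hypothetical name): for each word, emit (word, stripe), where
    the stripe counts the neighbors within `window` positions of that word."""
    words = sentence.lower().split()
    for i, word in enumerate(words):
        lo, hi = max(0, i - window), min(len(words), i + window + 1)
        stripe = Counter(words[j] for j in range(lo, hi) if j != i)
        if stripe:
            yield word, stripe


def shuffle(pairs):
    """Stand-in for the framework's shuffle phase: group values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_stripes(word, stripes):
    """Reducer (hypothetical name): element-wise sum of all stripes for a word."""
    total = Counter()
    for stripe in stripes:
        total.update(stripe)
    return word, total


def cooccurrence(sentences, window=2):
    """Run map, shuffle and reduce in-process over a list of sentences."""
    emitted = (pair for s in sentences for pair in map_stripes(s, window))
    return dict(reduce_stripes(w, ss) for w, ss in shuffle(emitted).items())


counts = cooccurrence(["the dog barks", "the dog sleeps"], window=1)
print(counts["dog"])
```

Because each mapper only looks at its own sentence and each reducer only at one word, both phases can be spread over many machines, which is where the near-linear scaling comes from.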

Want to read more

This is one of those books that makes you want to read more. For example, since reading this book I've looked into terms such as Zipfian, Brewer's CAP Theorem and Heaps' Law. I still need to learn more about Expectation Maximization and "Hidden Markov Models", harking back to some fundamental mathematics I had in college.

I want to read more about machine translation now, Koehn's book perhaps. And I definitely want to read the Google article about the "unreasonable effectiveness of data".

This is an excellent book, which provides a very readable introduction to the algorithms and real-world implementations.

By admin , 22 February 2011

This event takes place before the first soccer match of the Dayton Dutch Lions at 7.30pm (the club also has a Dutch background). Many traditional Dutch food and non-food products will be sold in a market in front of the stadium.

By admin , 22 February 2011

Event is in Dutch.

The Netherland Club and the boards of VVD USA, D66 USA and PvdA NY cordially invite you to follow the results of the 2011 Provincial Council elections (Provinciale Statenverkiezingen) live on a big screen in the Lounge of The Netherland Club, followed by a panel on the election results and their possible consequences for the composition of the Senate (Eerste Kamer) and the political landscape.

Election results: from 5 pm

Panel: 7.30 pm

RSVP: partijennyc@gmail.com

By admin , 22 February 2011

Photographers Laura Holley and Saskia Leary capture the images of flowers in their color photography. Leary calls this series of photographs “Dutch Delight” because so many of the images were taken at the National Park “The Keukenhof”, located in The Netherlands. For more than 60 years the Park has been a showcase for bulb growers to display new varieties and colors. Holley’s detailed “close-up images of flowers have a calming effect as the viewer is drawn into the shapes, color and patterns.”

Artist Reception: Fri Mar 25 6-8 p.m.

By admin , 22 February 2011

Economic ties with the Netherlands support more than 700,000 American jobs, according to a new report by the Royal Dutch Embassy. Trade and investment between the USA and the Netherlands pay dividends in imports, exports and job creation in both countries.

The Dutch are the third largest investor ($238 billion) in the United States, after the United Kingdom and Japan. In turn, the USA is the largest foreign investor in the Netherlands, with investments of $472 billion. Using data derived from the U.S. Department of Commerce’s Bureau of Economic Analysis and the Census Bureau, the report calculates that exports to the Netherlands and investment by Dutch companies such as AkzoNobel, Heineken, ING, Philips, Randstad, Royal Dutch Shell and Unilever supported more than 704,000 jobs in the United States from 2008-2009.

A state-by-state breakdown of Dutch investments and trade

The report provides a state-by-state breakdown of Dutch foreign direct investments and trade with the Netherlands. The three states benefiting the most from this economic relationship are Texas, California and Massachusetts. The report includes a clickable map of the United States with state-specific information on Dutch investment and trade.

Dutch Ambassador Renée Jones-Bos: "The report makes clear that many American jobs are the result of trade and foreign investment. While not underestimating the importance of emerging markets, this report shows how important existing economic ties are and continue to be for growth and recovery. Our strong economic bond has been forged by four centuries of shared ideals, business values and a commitment to entrepreneurism."

Heineken USA also weighed in. "The United States has been an important export market for Heineken since the repeal of Prohibition", stated Dan Tearno, Senior Vice President and Chief Corporate Relations Officer for Heineken USA. "The company’s United States presence is a significant part of its history, its heritage and its success".

AkzoNobel also noted its commitment to the U.S. "A strong, vibrant relationship between the Netherlands and United States has - and will continue to be - vital to the long term success of our company," said Erik Bouts, managing director of AkzoNobel's U.S. Paints Business. "As the world's largest paints and coatings company, our U.S. interests are significant in terms of revenue generation, employee base and AkzoNobel shareholders. And recent announcements like becoming the primary paint supplier to Walmart's 3,500 U.S. stores with the Glidden brand provide not only near-term job creation, but more broadly, evidence that AkzoNobel has a bright future in the U.S."

Texas, California and Massachusetts benefiting the most from Dutch-American economic ties

Economic Ties between the USA and the Netherlands reveals interesting data on the role of the Netherlands as a contributor to the economic engine of every state. For example, while New York City ranks fifth for jobs supported by Dutch industry and exports to the Netherlands, it ranks first among the American cities that trade with the Netherlands.

In Texas, Dutch foreign direct investment and Texas' exports to the Netherlands supported 100,062 jobs from 2008-2009. In California, Dutch ties support more than 60,000 jobs, and Dutch foreign direct investment in the state amounts to $5.8 billion, or 5.3% of total FDI to California. The Netherlands is the fifth largest investor in California. Dutch ties support 51,410 jobs in Massachusetts, which is equal to 8% of the population of Boston. Massachusetts received $3 billion in Dutch foreign direct investments, which comprised 11.6% of total FDI to the state.

Economic Ties between the USA and the Netherlands: Allies in Open Markets for Mutual Prosperity
A report by the Royal Dutch Embassy in Washington, D.C.
http://www.economicties.org

By admin , 18 February 2011

Two Artists in Dialog: Tantillo-Whitbeck - A Discussion of Contrasting Styles from a Common Source

Themes from the 17th Century Dutch animate the work of both Len Tantillo and James Whitbeck. Yet, the two manifest their work in contrasting styles.

Len Tantillo is well established with his work depicting historical moments of the Hudson Valley in a panoramic landscape style.

James Whitbeck, a native of the Berkshires, is establishing himself with work evocative of the 17th century Dutch masters in still life.

By admin , 17 February 2011

Tomorrow it will be 75 degrees, according to the weather forecast (24 in Celsius). Pretty amazing for February.

My little nephew Jasper is doing well; Ettie told me he has gotten his second tooth.
