Valleywag – valleywag.wordpress.com

Archive for the ‘Search’ Category

The Official Google Blog - Insights from Googlers into our products, technology and the Google culture

A few weeks back Udi Manber introduced the search quality group, and the previous posts in this series talked about the ranking of documents. While the ranking of web documents forms the core of what makes search at Google work so well, your search experience consists of much more than that. In this post, I’ll describe the principles that guide our development of the overall search experience and how they are applied to the key aspects of search. I will also describe how we make sure we are on the right track through rigorous experimentation. And the next post in this series will describe some of the experiments currently underway.

Let me introduce myself. I’m Ben Gomes, and I’ve been working on search at Google since 1999, mostly on search quality. I’ve had the good fortune to contribute to most aspects of the search engine, from crawling the web to ranking. More recently, I’ve been responsible for the engineering of the interface for search and search features.

A common reaction from friends when I say that I now work on Google’s search user interface is “What do you do? It never changes.” Then they look at me suspiciously and tell me not to mess with a good thing: “Google is fine just the way it is — a plain, fast, simple web page. That’s great, but how hard can that be?”

To help answer that question, let me start with our main goal in web search: to get you to the web pages you want as quickly as possible. Search is not an end in itself; it is merely a conduit. This goal may seem obvious, but it makes a search engine radically different from most other sites on the web, which measure their success by how long their users stay. We measure our web search success partly by how quickly you leave (happily, we hope!). There are several principles we use in getting you to the information you need as quickly as possible:

  • A small page. A small page is quick to download and generally faster for your browser to display. This results in a minimalist design aesthetic; extra fanciness in the interface slows down the page without giving you much benefit.
  • Complex algorithms with a simple presentation. Many search features require a great deal of algorithmic complexity and a vast amount of data analysis to make them work well. The trick is to hide all that complexity behind a clean, intuitive user interface. Spelling correction, snippets, sitelinks and query refinements are examples of features that require sophisticated algorithms and are constantly improving. From the user’s point of view, search just works better, almost invisibly.
  • Features that work everywhere. Features must be designed such that the algorithms and presentation can be adapted to work in all languages and countries. Consider the problem of spelling correction in Chinese, where user queries are often not broken up into words, or in Hebrew and Arabic, where text is written right to left (interestingly, this is believed to be an example of first-mover disadvantage — when chiseling on stone, it is easier to hold the hammer in your right hand!).
  • Data-driven decisions – experiment, experiment, experiment. We try to verify that we’ve done the right thing by running experiments. Designs that seem promising may end up testing poorly.

There are inherent tensions here. For instance, showing you more text (or images) for every result may enable you to better pick out the best result. But a result page that has too much information takes longer to download and longer to visually process. So every piece of information that we add to the result page has to be carefully considered to ensure that the benefit to the user outweighs the cost of dealing with that additional information. This is true of every part of the search experience, from typing in a query, to scanning results, to further exploration.

Having formulated your query, your next task is to pick a page from the result list. For each result, we present the title, the URL, and a brief two-line snippet. Pages that don’t have a proper title are often ignored by users. One of the bigger recent changes has been to extract titles for pages that don’t specify an HTML title, even though a title is clearly visible on the page itself. To “see” the title the author intended, we analyze the HTML of the page to determine the title the author probably meant. This makes it far more likely that you will not ignore a page for want of a good title.
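To make this concrete, here is a minimal sketch of that kind of fallback, using a toy heuristic rather than Google’s actual extraction system: if a page declares no usable <title>, use its most prominent heading instead. The class and function names are invented for the example.

    # A toy heuristic, not Google's system: fall back to the page's top
    # heading when the <title> element is missing or empty.
    from html.parser import HTMLParser

    class TitleGuesser(HTMLParser):
        def __init__(self):
            super().__init__()
            self.title = ""
            self.h1 = ""
            self._current = None  # tag whose text we are currently collecting

        def handle_starttag(self, tag, attrs):
            if tag in ("title", "h1") and not getattr(self, tag):
                self._current = tag

        def handle_data(self, data):
            if self._current:
                setattr(self, self._current, getattr(self, self._current) + data)

        def handle_endtag(self, tag):
            if tag == self._current:
                self._current = None

    def guess_title(html):
        parser = TitleGuesser()
        parser.feed(html)
        # Prefer the declared <title>; otherwise use the top heading.
        return parser.title.strip() or parser.h1.strip() or "(untitled)"

    print(guess_title("<html><body><h1>Quarterly Report</h1><p>...</p></body></html>"))
    # prints: Quarterly Report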

We have been making improvements to our snippets over time with algorithms for determining the relevance of portions of the page. The changes range from the subtle (we highlight synonyms of your query terms in the results) to the more obvious. For example, when a user searches for “arod”, Alex and Rodriguez are bolded in the search result snippet, based on our analysis that the query plausibly refers to Alex Rodriguez.
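A toy illustration of the highlighting idea follows; the synonym table is a hand-written stand-in for the learned query expansions described above, not the production system.

    # Illustrative only: bold query terms and a few hard-coded synonyms
    # wherever they appear in a snippet.
    import re

    SYNONYMS = {"arod": ["alex", "rodriguez", "a-rod"]}  # hypothetical mapping

    def highlight(snippet, query):
        terms = set()
        for word in query.lower().split():
            terms.add(word)
            terms.update(SYNONYMS.get(word, []))
        pattern = re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE)
        return pattern.sub(lambda m: "<b>" + m.group(0) + "</b>", snippet)

    print(highlight("Alex Rodriguez hit his 500th home run", "arod"))
    # prints: <b>Alex</b> <b>Rodriguez</b> hit his 500th home run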

As a more obvious example, we now extract and show you the byline date from pages that have one. These byline dates are expressed in a myriad of formats, which we extract and present uniformly so that you can scan them easily.
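The normalization step might look roughly like the sketch below, which assumes a small fixed list of source formats; a real extractor would recognize far more variations and use page context to find the date in the first place.

    # A simplified sketch: parse a handful of common date formats and render
    # them in one uniform style for display next to each result.
    from datetime import datetime

    FORMATS = ["%B %d, %Y", "%d %b %Y", "%Y-%m-%d", "%m/%d/%Y"]

    def normalize_byline(raw):
        for fmt in FORMATS:
            try:
                return datetime.strptime(raw.strip(), fmt).strftime("%b %d, %Y")
            except ValueError:
                pass
        return None  # date not recognized; show the result without one

    for raw in ["July 21, 2008", "21 Jul 2008", "2008-07-21"]:
        print(normalize_byline(raw))  # each prints: Jul 21, 2008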

For one of the most common types of user needs, navigational queries — where you type in the name of a web site you know — we have introduced shortcuts (we refer to them as sitelinks). These sitelinks allow you to get to the key parts of the site and illustrate many of the same principles alluded to above; they are a simple addition to the top search result that adds a small amount of extra text to the page.

For instance, the home page of Hewlett-Packard has almost 60 links in a two-level menu system. Our algorithms use a combination of different signals to pick the handful among them that we think you are most likely to want to visit.
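As an illustration only, a selection step along those lines could score each candidate link with a weighted combination of signals and keep the top few. The signal names and weights below are invented for the example and are not Google’s.

    # Illustrative sitelink selection: combine (made-up) signals into one
    # score per link and keep the highest-scoring ones.
    def pick_sitelinks(links, max_links=4):
        def score(link):
            return (0.5 * link["click_share"]                    # share of navigation clicks
                    + 0.3 * link["inbound_links"] / 1000.0       # rough popularity proxy
                    + 0.2 * (1.0 if link["in_top_menu"] else 0.0))
        return sorted(links, key=score, reverse=True)[:max_links]

    links = [
        {"url": "/support", "click_share": 0.40, "inbound_links": 800, "in_top_menu": True},
        {"url": "/drivers", "click_share": 0.25, "inbound_links": 600, "in_top_menu": True},
        {"url": "/careers", "click_share": 0.05, "inbound_links": 200, "in_top_menu": False},
    ]
    print([l["url"] for l in pick_sitelinks(links, max_links=2)])
    # prints: ['/support', '/drivers']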

What if you did not find what you were looking for among the top results? In that case, you probably need to try another query. We help you in this process by providing a set of query refinements at the bottom of the results page — even if they don’t give you the exact query you need, they suggest different (likely more successful) directions in which you could refine your search. Placed at the bottom of the page, the refinements don’t distract you, but they are there to help if the rest of the results didn’t serve your information need.

I’ve described several key aspects of the search experience, including areas where we have made many changes over time — some subtle, some more obvious. In making these changes, how do we know we’ve succeeded, and that we’ve not messed it up? We constantly evaluate our changes by sharing them with you: we launch proposed changes to a tiny fraction of our users and evaluate whether they seem to be helping or hurting the search experience. There are many metrics we use to determine whether we’ve succeeded or failed, and the process of measuring these improvements is a science in itself, with many potential pitfalls. Our experimental methodology allows us to explore a range of possibilities and launch the ones that work best. For every feature we launch, we have typically run a large number of experiments that never saw the light of day.
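One common way to run experiments of this kind (described here as a generic sketch, not Google’s infrastructure) is to hash a stable user identifier into buckets, so that a small, consistent fraction of users sees the new treatment while everyone else sees the control.

    # Generic A/B bucketing sketch: the same user always lands in the same
    # group for a given experiment, and only a small fraction is exposed.
    import hashlib

    def assign_variant(user_id, experiment="snippet_v2", treatment_fraction=0.01):
        digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 10000           # 10,000 evenly sized buckets
        return "treatment" if bucket < treatment_fraction * 10000 else "control"

    print(assign_variant("user-42"))  # stable answer for this user and experiment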

So let me answer the question I started with: We’re actually constantly changing Google’s result page and have been doing so for a long time. And no, we won’t mess with a good thing. You won’t let us.

In the next post in this series, I’ll talk about some of the experiments we are running, and what we hope to learn from them.

Clickry Post Source Link

MENLO PARK, California (Reuters) – A start-up led by former star Google engineers on Sunday unveiled a new Web search service that aims to outdo the Internet search leader in size, but faces an uphill battle changing Web surfing habits.
Cuil Inc (pronounced “cool”) is offering a new search service at www.cuil.com that the company claims can index, faster and more cheaply, a far larger portion of the Web than Google, which boasts the largest online index.
The would-be Google rival says its service goes beyond prevailing search techniques that focus on Web links and audience traffic patterns and instead analyzes the context of each page and the concepts behind each user search request.
“Our significant breakthroughs in search technology have enabled us to index much more of the Internet, placing nearly the entire Web at the fingertips of every user,” Tom Costello, Cuil co-founder and chief executive, said in a statement.
Danny Sullivan, a Web search analyst and editor-in-chief of Search Engine Land, said Cuil can try to exploit complaints consumers may have with Google — namely, that it tries to do too much, that its results favor already popular sites, and that it leans heavily on certain authoritative sites such as Wikipedia.
“The time may be right for a challenger,” Sullivan said, but added quickly: “Competing with Google is still a very daunting task, as Microsoft will tell you.”
Microsoft Corp, the No. 3 U.S. player in Web search, has been seeking in vain, so far, to join forces with No. 2 Yahoo Inc to battle Google.
Cuil was founded by a group of search pioneers, including Costello, who built a prototype of Web Fountain, IBM’s Web search analytics tool, and his wife, Anna Patterson, the architect of Google Inc’s massive TeraGoogle index of Web pages. Patterson also designed the search system for global corporate document storage company Recall, a unit of Australia’s Brambles Ltd.

Clickry Post Source Link

After weeks of vague statements about internal investments that will allow it to compete in search without Yahoo, Microsoft on Wednesday laid out more of its vision for improving on its current “underdog” position in search.

While describing some new search technologies from Microsoft and some future ideas, executives were also careful to repeat that theirs is a long-term vision that may take a while to spell success for the company. They spoke during an annual get-together for advertisers, this year hosted on Microsoft’s campus in Redmond, Washington.

“I have to say, it’s kind of fun to be the underdog,” Microsoft Chairman Bill Gates confessed. The company has put an unusual effort toward building the team that’s working on search, he said. “We’ve done more on this to build a great team than on any effort I can remember,” he said.

Users should expect to see new features every six months from Microsoft’s search group, he said. “We have a long-term commitment,” Gates said. The company is willing to experiment, he said.

Wednesday’s launch of Cashback represents the latest new feature. When Web users search for a product on Live.com, results may feature a Cashback tag. If users end up buying a product with the tag, they’ll receive money back.

Microsoft expects that the concept will create a whole new business model, though it also expects that it might take some time for it to shake up the industry. “We understand this is a journey. When you change the user experience or business model, it takes time to percolate through to behavior changes,” said Satya Nadella, senior vice president of the search, portal and advertising platform group at Microsoft.

Gates pointed out how Cashback is different from existing search advertising methods. “In search, when you get those ads, in a sense you don’t get anything back in return,” he said. That compares to other media like TV or radio, where in exchange for advertisements, viewers and listeners get content.

Cashback “gives you a reason why you should use a particular search,” he said.

Over 700 merchants, including eBay, Barnes and Noble, Sears, Circuit City, Home Depot, Zappos.com, Overstock.com and Kmart, have signed up to advertise as part of the Cashback program. “That confirms there is this opportunity for change,” Gates said.

Clickry Post Source Link

A man is trapped in the debris in earthquake-hit Beichuan county

Thousands of people are still trapped beneath ruined buildings

A massive search and rescue operation is under way in south-western China after one of the most powerful earthquakes in decades.

Troops have arrived in Wenchuan county at the epicentre, which was largely cut off by the quake – but heavy rain is hampering rescue operations.

Elsewhere in Sichuan province, frantic efforts are being made to reach thousands of people under the rubble.

The death toll is now more than 12,000, officials say, and looks set to rise.

Chinese rescuers search a collapsed building for survivors in Beichuan, Sichuan province, on Tuesday

In one city, Mianyang, near the epicentre, more than 18,000 people are said to be buried under the rubble and 3,629 have been confirmed dead, state news agency Xinhua reports.

In the nearby town of Mianzhu, at least 4,800 people are trapped under the rubble and massive landslides have buried roads to outlying villages, Xinhua says.

Premier Wen Jiabao was quick to reach the scene and urged rescuers to clear roads into the worst-hit areas as fast as possible.

“As long as there is even a little hope, we will redouble our efforts 100 times and will never relax our efforts,” he told crying locals through a loudhailer in the badly hit Dujiangyan city, south-east of the epicentre.

The health ministry has made an urgent appeal for people to give blood to help the injured.

Clickry Post Source Link


This April 30, 2008 file photo shows an exterior view of Yahoo headquarters in Sunnyvale, Calif. Microsoft Corp. has withdrawn its $42.3 billion bid to buy Yahoo Inc., scrapping an attempt to snap up the tarnished Internet icon in hopes of toppling online search and advertising leader Google Inc. The decision to walk away from the deal came Saturday May 3, 2008 after last-ditch efforts to negotiate a mutually acceptable sale price proved unsuccessful. (AP Photo/Paul Sakuma, File)

Yahoo Inc. and McAfee Inc. are joining to offer alerts about potentially dangerous Web sites alongside search results generated at Yahoo.com.

With the new security feature — slated to take effect Tuesday — people who search the Internet using Yahoo will see a red exclamation point and a warning next to links McAfee has identified as serving dangerous downloads or using visitors’ e-mail addresses to send out spam.

Dangerous downloads can include “adware,” which shows unwanted advertisements; “spyware,” which secretly tracks users’ keystrokes and other actions; and other malicious programs that can give criminals control over users’ computers.

Yahoo and McAfee hope the move will quell users’ anxiety about accidentally clicking on malicious links.

“Yahoo users have clearly told us that among the most important concerns for them are all these lurking threats on the Internet,” said Priyank Garg, director of product management for Yahoo’s search division. “They know the damage they can do but they don’t know how to protect themselves.”

Yahoo has decided to simply nuke the worst offenders — sites that attempt “drive-by downloads,” automatically installing malicious code on visitors’ computers by exploiting coding flaws in their Web browsers.

If McAfee has identified a site as having employed such tactics, Yahoo users won’t see the link at all.
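In outline, the behavior described above amounts to checking each result against a safety ratings table, warning on risky sites and dropping the worst ones. The sketch below uses a made-up in-memory table standing in for McAfee’s data.

    # Illustrative only: annotate or filter search results using a
    # hypothetical per-site safety rating table.
    RATINGS = {
        "example-downloads.test": "risky",     # adware/spam history -> show warning
        "driveby.test": "drive_by",            # drive-by downloads -> remove entirely
    }

    def annotate_results(results):
        annotated = []
        for result in results:
            rating = RATINGS.get(result["host"], "ok")
            if rating == "drive_by":
                continue                        # never shown to the user
            result["warning"] = (rating == "risky")  # red "!" next to the link
            annotated.append(result)
        return annotated

    results = [{"host": "driveby.test"}, {"host": "example-downloads.test"}, {"host": "news.test"}]
    print(annotate_results(results))
    # prints the two surviving results, with a warning flag on the risky one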

“When a user gets a set of search results, there’s really no indication of who’s a good guy and who’s a bad guy,” said Tim Dowling, vice president of McAfee’s Web Security Group. “You’re really leaping off a platform of faith that you’re clicking on a site that’s safe and not one that’s bad. And the bad guys really try hard to look good.”

The companies declined to reveal the financial terms of the partnership.

The deal represents the latest attempt by Sunnyvale-based Yahoo to lure more search requests, snap out of its recent financial funk and steal advertising dollars from search leader Google Inc. as it tries to justify its rebuff of Microsoft Corp.’s $47.5 billion takeover bid.

Clickry Post Source Link

Google researchers say they have a software technology intended to do for digital images on the Web what the company’s original PageRank software did for searches of Web pages.

On Thursday at the International World Wide Web Conference in Beijing, two Google scientists presented a paper describing what the researchers call VisualRank, an algorithm for blending image-recognition software methods with techniques for weighting and ranking images that look most similar.

Although image search has become popular on commercial search engines, results are usually generated today by using cues from the text that is associated with each image.

Despite decades of effort, image analysis remains a largely unsolved problem in computer science, the researchers said. For example, while progress has been made in automatic face detection in images, finding other objects such as mountains or teapots, which are instantly recognizable to humans, has lagged.

“We wanted to incorporate all of the stuff that is happening in computer vision and put it in a Web framework,” said Shumeet Baluja, a senior staff researcher at Google, who made the presentation with Yushi Jing, another Google researcher. The company’s expertise in creating vast graphs that weigh “nodes,” or Web pages, based on their “authority” can be applied to images that are the most representative of a particular query, he said.
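As a rough illustration of that framing (not the published algorithm), one can treat each image as a node, use pairwise visual similarity as the edge weight, and run a PageRank-style iteration so that images similar to many other well-ranked images rise to the top. The similarity matrix below is a hand-made placeholder.

    # PageRank-style iteration over a tiny, hand-made image-similarity graph.
    def visual_rank(similarity, damping=0.85, iters=50):
        n = len(similarity)
        # Column-normalize so each image distributes its "vote" across neighbors.
        col_sums = [sum(similarity[i][j] for i in range(n)) or 1.0 for j in range(n)]
        rank = [1.0 / n] * n
        for _ in range(iters):
            rank = [
                (1 - damping) / n
                + damping * sum(similarity[i][j] / col_sums[j] * rank[j] for j in range(n))
                for i in range(n)
            ]
        return rank

    # Three images: 0 and 1 look alike; 2 is an outlier, so 0 and 1 rank higher.
    sim = [[0.0, 0.9, 0.1],
           [0.9, 0.0, 0.1],
           [0.1, 0.1, 0.0]]
    print(visual_rank(sim))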

The research paper, “PageRank for Product Image Search,” is focused on a subset of the images that the giant search engine has cataloged because of the tremendous computing costs required to analyze and compare digital images. To do this for all of the images indexed by the search engine would be impractical, the researchers said. Google does not disclose how many images it has cataloged, but it asserts that its Google Image Search is the “most comprehensive image search on the Web.”

The company said that in its research it had concentrated on the 2,000 most popular product queries on Google’s product search, words such as iPod, Xbox and Zune. It then sorted the top 10 images both from its ranking system and from the standard Google Image Search results. With a team of 150 Google employees, it created a scoring system for image “relevance.” The researchers said the retrieval returned 83 percent fewer irrelevant images.

Clickry Post Source Link

