
Thorsten Brants Phones & Addresses

  • 3750 Ross Rd, Palo Alto, CA 94303 (650) 493-3493
  • 3752 Ross Rd, Palo Alto, CA 94303 (650) 493-3493

Publications

ISBN (Books and Publications)

Tagging and Parsing with Cascaded Markov Models: Automation of Corpus Annotation

Author

Thorsten Brants

ISBN #

3933218055

US Patents

Systems And Methods For Determining The Topic Structure Of A Portion Of Text

US Patent:
7130837, Oct 31, 2006
Filed:
Mar 22, 2002
Appl. No.:
10/103053
Inventors:
Thorsten H. Brants - Palo Alto CA, US
Francine R. Chen - Menlo Park CA, US
Assignee:
Xerox Corporation - Stamford CT
International Classification:
G06F 17/00
G06N 7/00
G06N 7/08
US Classification:
706/55, 706/46, 706/45
Abstract:
Systems and methods for determining the topic structure of a document including text utilize a Probabilistic Latent Semantic Analysis (PLSA) model and select segmentation points based on similarity values between pairs of adjacent text blocks. PLSA forms a framework for both text segmentation and topic identification. The use of PLSA provides an improved representation for the sparse information in a text block, such as a sentence or a sequence of sentences. Topic characterization of each text segment is derived from PLSA parameters that relate words to “topics”, latent variables in the PLSA model, and “topics” to text segments. A system executing the method exhibits significant performance improvement. Once determined, the topic structure of a document may be employed for document retrieval and/or document summarization.
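The segmentation step described in this abstract can be illustrated with a minimal sketch. It assumes each text block has already been mapped to a topic distribution (for example by a trained PLSA model, which is not shown here). The function names, the use of cosine similarity, and the threshold value are illustrative assumptions and are not taken from the patent.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def segmentation_points(block_topics, threshold=0.5):
    """Return indices i where a boundary falls between block i and block i+1.

    A boundary is placed where the similarity of adjacent blocks is a local
    minimum and also below `threshold` (a hypothetical tuning parameter).
    """
    sims = [cosine(block_topics[i], block_topics[i + 1])
            for i in range(len(block_topics) - 1)]
    points = []
    for i, s in enumerate(sims):
        left = sims[i - 1] if i > 0 else float("inf")
        right = sims[i + 1] if i + 1 < len(sims) else float("inf")
        if s < threshold and s <= left and s <= right:
            points.append(i)
    return points

# Hypothetical topic distributions for four text blocks (two topics).
blocks = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
boundaries = segmentation_points(blocks)
```

In this toy input the first two blocks share one dominant topic and the last two share another, so the only low-similarity pair is the middle one.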

Method And Apparatus For Generating Overview Information For Hierarchically Related Information

US Patent:
7280957, Oct 9, 2007
Filed:
Dec 16, 2002
Appl. No.:
10/321420
Inventors:
Paula S. Newman - Los Altos CA, US
John C. Blitzer - Fort Myers FL, US
Thorsten H. Brants - Palo Alto CA, US
Assignee:
Palo Alto Research Center, Incorporated - Palo Alto CA
International Classification:
G06F 17/27
US Classification:
704/9, 704/1
Abstract:
A method is provided for digesting the content of hierarchically related information. The method, which obtains relatively short overviews, selects a proportion of representative nodes and then extracts and organizes one or more sentences from the text associated with each selected node. For text trees representing archived discussions, the selection of nodes and sentences is from comment/response sequences drawn from lexically central nodes which will capture those aspects of the discussion considered most important to discussion participants.

Systems And Methods For Sentence Based Interactive Topic-Based Text Summarization

US Patent:
7376893, May 20, 2008
Filed:
Dec 16, 2002
Appl. No.:
10/319544
Inventors:
Francine R. Chen - Menlo Park CA, US
Thorsten H. Brants - Palo Alto CA, US
Annie E. Zaenen - Redwood City CA, US
Assignee:
Palo Alto Research Center Incorporated - Palo Alto CA
International Classification:
G06N 3/00
US Classification:
715/254, 715/255
Abstract:
Techniques for determining sentence based interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text provides contextualized access to an interactive topic-based text summary and to an original text.

Systems And Methods For Interactive Topic-Based Text Summarization

US Patent:
7451395, Nov 11, 2008
Filed:
Dec 16, 2002
Appl. No.:
10/319508
Inventors:
Thorsten H. Brants - Palo Alto CA, US
Francine R. Chen - Menlo Park CA, US
Annie E. Zaenen - Redwood City CA, US
Assignee:
Palo Alto Research Center Incorporated - Palo Alto CA
International Classification:
G06F 17/00
US Classification:
715/254, 715/255
Abstract:
Techniques for determining interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text provides contextualized access to an interactive topic-based text summary and to an original text.

Methods, Apparatus, And Program Products For Performing Incremental Probabilistic Latent Semantic Analysis

US Patent:
7529765, May 5, 2009
Filed:
Nov 23, 2004
Appl. No.:
10/996873
Inventors:
Thorsten H. Brants - Palo Alto CA, US
Ioannis Tsochantaridis - Peristeri, GR
Thomas Hofmann - Darmstadt, DE
Francine R. Chen - Menlo Park CA, US
Assignee:
Palo Alto Research Center Incorporated - Palo Alto CA
International Classification:
G06F 7/00
US Classification:
707/102, 707/5, 715/267
Abstract:
One aspect of the invention is that of efficiently and incrementally adding new terms to an already trained probabilistic latent semantic analysis (PLSA) model.

Systems And Methods For New Event Detection

US Patent:
7577654, Aug 18, 2009
Filed:
Jul 25, 2003
Appl. No.:
10/626856
Inventors:
Thorsten H. Brants - Palo Alto CA, US
Francine R. Chen - Menlo Park CA, US
Ayman O. Farahat - San Francisco CA, US
Assignee:
Palo Alto Research Center Incorporated - Palo Alto CA
International Classification:
G06F 7/00
G06F 17/30
US Classification:
707/6
Abstract:
Techniques for new event detection are provided. For a new story and a corpus of stories, story-pairs based on the new story and each corpus story are determined. Adjustments to the importance of terms are determined based on story characteristics associated with each story. Story characteristics are based on direct or indirect characteristics. Direct story characteristics include authorship, language associated with a story and the like. Indirect story characteristics may include derived characteristics such as an ROI category characteristic, a same ROI characteristic, a same event-same source characteristic, an average story similarity characteristic or any other known or later developed characteristic associated with a story. Adjustments to the inter-story similarity metrics are then determined based on story characteristics and/or a weighting function. New event scores and/or new event categorizations for stories are determined based on the inter-story similarity metrics and the adjustments based on the story characteristics.
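The core comparison in this abstract, scoring a new story against each story in a corpus, can be sketched as follows. This is a simplified illustration only: it uses raw term counts and cosine similarity, and it omits the story-characteristic adjustments the abstract describes. All function names and the threshold are assumptions made for the example.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def new_event_score(new_story, corpus):
    """Score is 1 minus the similarity to the closest corpus story."""
    new_vec = Counter(new_story.lower().split())
    sims = [cosine(new_vec, Counter(s.lower().split())) for s in corpus]
    return 1.0 - max(sims, default=0.0)

def is_new_event(new_story, corpus, threshold=0.7):
    """Flag the story as a new event if nothing in the corpus is close."""
    return new_event_score(new_story, corpus) >= threshold

# Hypothetical corpus of earlier stories.
corpus = ["earthquake hits city", "election results announced"]
```

A story that repeats an earlier one scores near zero, while a story sharing no terms with the corpus scores 1.0.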

Semantic Unit Recognition

US Patent:
7580827, Aug 25, 2009
Filed:
Dec 31, 2003
Appl. No.:
10/748654
Inventors:
Thorsten Brants - Palo Alto CA, US
Jay Ponte - Mountain View CA, US
Assignee:
Google Inc. - Mountain View CA
International Classification:
G06F 17/20
US Classification:
704/1, 704/9, 704/257
Abstract:
A semantic locator determines whether input sequences form semantically meaningful units. The semantic locator includes a coherence component that calculates a coherence of the terms in the sequence and a variation component that calculates the variation in terms that surround the sequence. A heuristics component may additionally refine results of the coherence component and the variation component. A decision component may make the determination of whether the sequence is a semantic unit based on the results of the coherence component, variation component, and heuristics component.
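One way to picture the coherence and variation components is the sketch below, restricted to two-word sequences. It assumes pointwise mutual information as the coherence measure and the number of distinct neighbouring words as the variation measure. Both choices, the thresholds, and all names are illustrative assumptions; the abstract does not specify them, and the heuristics component is omitted.

```python
import math
from collections import Counter

def coherence(tokens, first, second):
    """Pointwise mutual information of the bigram (first, second)."""
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    pair = bigrams[(first, second)]
    if pair == 0:
        return float("-inf")
    p_pair = pair / (n - 1)
    return math.log(p_pair / ((unigrams[first] / n) * (unigrams[second] / n)))

def variation(tokens, first, second):
    """Count distinct words seen immediately left or right of the bigram."""
    contexts = set()
    for i in range(len(tokens) - 1):
        if tokens[i] == first and tokens[i + 1] == second:
            if i > 0:
                contexts.add(("L", tokens[i - 1]))
            if i + 2 < len(tokens):
                contexts.add(("R", tokens[i + 2]))
    return len(contexts)

def is_semantic_unit(tokens, first, second, min_coherence=1.0, min_variation=3):
    """Decide using both components; the thresholds are hypothetical."""
    return (coherence(tokens, first, second) >= min_coherence
            and variation(tokens, first, second) >= min_variation)

tokens = "i love new york . she left new york today . new york is big".split()
```

Here "new york" co-occurs often and appears in varied surroundings, whereas "york today" occurs once in a single context, so only the former passes both checks.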

System And Method For Providing Text Summarization For Use In Web-Based Content

US Patent:
7587309, Sep 8, 2009
Filed:
Dec 1, 2003
Appl. No.:
10/725883
Inventors:
Christopher Rohrs - Palo Alto CA, US
Thorsten Brants - Palo Alto CA, US
Assignee:
Google, Inc. - Mountain View CA
International Classification:
G06F 17/21
G06F 17/28
US Classification:
704/10, 715/236, 715/244
Abstract:
A system and method for providing text summarization for use in Web-based content is presented. Text is determined responsive to an executed query. Phrases within the text are identified, and words within the phrases are marked using matches of the words within the phrases with words of the executed query and/or a format rule. Marked words are placed into the summarized text subject to space restrictions. A system and method for building Web-based advertising creatives is also presented. At least one item description responsive to an executed query is identified and a name is extracted. Marked words are placed into the advertising creative subject to space restrictions.