Secret PhDing: 2008

Friday, 17 October 2008

17.10.2008

The morning was mainly administrative stuff. I met with Djoerd later and we had a good discussion on the expected probability of relevance. He agrees that the formulation might be good. We think that we could also use the odds (as done in the binary independence model) instead of the probability of relevance. I also see a lot of other advantages. Formally, the observations don't have to be the same: for one shot we can observe 100 clicks and for another one none, and it would still fit into the formula. Furthermore, we could use different features, if available.
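To make the idea a bit more concrete for myself, here is a minimal sketch. It assumes that each "possible world" is one possible state of a shot with a probability attached, and that some relevance model gives P(R | world). The function names and the numbers are made up for illustration; this is not the formulation from the paper, only one way to read it.

```python
def expected_relevance(worlds):
    """Expected probability of relevance over possible worlds.

    worlds: list of (p_world, p_rel_given_world) pairs, where the
    p_world values are assumed to sum to 1.
    """
    total = sum(p_world for p_world, _ in worlds)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("world probabilities must sum to 1")
    return sum(p_world * p_rel for p_world, p_rel in worlds)


def odds(p):
    """Odds of relevance O = p / (1 - p), as used in the binary independence model."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return p / (1.0 - p)


# Hypothetical shots: the number of worlds per shot does not have to match.
shot_a = [(0.7, 0.9), (0.3, 0.2)]  # two possible worlds
shot_b = [(1.0, 0.5)]              # a single, certain world

ranking = sorted(
    {"shot_a": shot_a, "shot_b": shot_b}.items(),
    key=lambda item: expected_relevance(item[1]),
    reverse=True,
)
```

Since the odds are a monotone function of the probability, ranking by odds(expected_relevance(...)) gives the same order as ranking by the expected probability itself. Taking the expectation of the odds over the worlds would be a different quantity.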

In the afternoon I was mainly writing the TRECVID paper. It seems I have difficulty staying on one line of thought, especially because the content is not interesting to me. At least I switched off chat and email.

Wednesday, 15 October 2008

15.10.2008

Today I worked on the language-model relevance paper and began to work on the TRECVID 2008 paper. With the language-model relevance paper I got somewhat stuck. The ranking of one example query from the Cranfield dataset shows really bad results (using a "perfect" estimate of p(mu|R)).

Monday, 25 August 2008

Adjustment of Aims

General Direction: Expected Probability of relevance given the distribution of "possible worlds".

Four Main Parts
Detector Selection
Detector Improvement
Probability of Relevance / Odds ...
Evaluation

Planning
Short Term

Improvement of Concept Detectors through Chronological Timeline (Marijn?)
Improving Concept Detectors through User Clicks
Expected Probability of Relevance given Speech Lattices

Mid Term

SIGIR
Probability of Relevance Models (BM25 / Language Modelling)
Expected Benefit Ranking Principle
Internship at Yahoo

Thursday, 29 May 2008

Write Write Write

Today I had a talk with Djoerd. It turns out that I still don't write enough. Experiments should get noted down!

Tuesday, 13 May 2008

2008 05 13

I started the feature extraction on the TRECVID 2005 data today, to have a comparison.

For finding P(C|R), I unfortunately cannot recover the settings that gave relatively good results. It seems that Lemur implements the Okapi method in a partial way (it returns a document as early as possible). Therefore the score is not really accurate.
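As a reference point for what a complete score would look like, here is a minimal sketch of the textbook Okapi BM25 formula, summed over all query terms. This is not Lemur's code and says nothing about its internals; the function name, the parameter defaults (k1 = 1.2, b = 0.75) and the input layout are my own assumptions.

```python
import math


def bm25_score(query_terms, doc_tf, doc_len, avg_doc_len, df, n_docs,
               k1=1.2, b=0.75):
    """Textbook Okapi BM25 score of one document for one query.

    doc_tf: term -> frequency in the document
    df:     term -> number of documents containing the term
    """
    score = 0.0
    for term in query_terms:
        tf = doc_tf.get(term, 0)
        n_t = df.get(term, 0)
        if tf == 0 or n_t == 0:
            continue  # term contributes nothing to this document
        # Robertson/Sparck Jones idf; it can become negative for terms
        # that occur in more than half of the collection.
        idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5))
        # Term-frequency saturation with document length normalisation.
        norm = tf + k1 * (1.0 - b + b * doc_len / avg_doc_len)
        score += idf * tf * (k1 + 1.0) / norm
    return score


# Hypothetical collection statistics, for illustration only.
df = {"concept": 5, "video": 40}
low = bm25_score(["concept"], {"concept": 1}, 100, 100, df, n_docs=100)
high = bm25_score(["concept"], {"concept": 3}, 100, 100, df, n_docs=100)
none = bm25_score(["concept"], {"video": 2}, 100, 100, df, n_docs=100)
```

Comparing scores like these against the toolkit output would only make sense with the same parameters and the same collection statistics, which is exactly what I cannot reconstruct at the moment.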

Friday, 9 May 2008

TRECVID

We signed up for the high-level feature extraction task. For this, Arjen contacted Christos from Greece, who gave us binaries to extract Weibull features from JPEG images. We are extracting these low-level features now and will train (via cross-validation) visual-only models for the concepts. These models will be used to find the occurrence probabilities on the training sets of the 2007 (and later 2008) data. With these we can make the results of the CIVR paper comparable between 2005 and 2007.
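One way to read the cross-validation step is that every training shot gets its occurrence probability from a model that never saw that shot. A rough sketch of that reading follows, assuming a generic train/predict pair. The placeholder model below just predicts the fraction of positive labels and stands in for the real visual-only classifier; all names are hypothetical.

```python
def k_fold_indices(n_items, k):
    """Split range(n_items) into k disjoint folds (round-robin assignment)."""
    return [list(range(start, n_items, k)) for start in range(k)]


def out_of_fold_scores(features, labels, k, train, predict):
    """Score every item with a model trained on the other k - 1 folds."""
    scores = [None] * len(features)
    for fold in k_fold_indices(len(features), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(features) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        model = train(train_x, train_y)
        for i in fold:
            scores[i] = predict(model, features[i])
    return scores


# Placeholder "model": the fraction of positive labels in the training part.
def prior_train(train_x, train_y):
    return sum(train_y) / len(train_y)


def prior_predict(model, x):
    return model


# Hypothetical toy data: four shots, concept present in shots 0 and 2.
scores = out_of_fold_scores([0, 0, 0, 0], [1, 0, 1, 0], k=2,
                            train=prior_train, predict=prior_predict)
```

With a real classifier in place of the placeholder, the resulting scores could serve as the occurrence probabilities for the training set itself, without the optimism of scoring shots the model was trained on.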
