Domain Driven Boosting & User Driven Biasing
Chris Hostetter
Apache Lucene/Solr SFBay Meetup
2012-06-26
Boosting & Biasing
Classical IR algorithms aren't the be-all and end-all
- TF/IDF doesn't know your domain
- TF/IDF doesn't know your users
Examples For Today
- Data Domain: Tech Products
- Popularity
- Release date
- Price & margin
- Product type or category
- Users
- Search your website
- Buy products
Domain Driven Boosting
Trivial Example
q = the user input
vs.
q = +(the user input) suck:false^0.2
Real World Example (Ancient)
q = +(the user input)^100
(*:* -cat:(7, 9, 23))^5
rating:[4 TO *]^3
rating:[7 TO *]^10
popularity:[* TO 1000]^5
Real World Example (Today)
qq = the user input
q = {!boost b=$b v=$qq}
b = div($good,price)
good = mul(rating,popularity,cat_weight)
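A minimal sketch of what the boost above works out to per document, assuming the `{!boost}` parser multiplies the relevance score of `$qq` by the value of `$b`. The function name `boosted_score` and the sample values are hypothetical; `rating`, `popularity`, `cat_weight`, and `price` stand in for the indexed fields named on the slide.

```python
def boosted_score(base_score, rating, popularity, cat_weight, price):
    """Scale a relevance score by good / price, mirroring b = div($good,price)."""
    good = rating * popularity * cat_weight  # good = mul(rating,popularity,cat_weight)
    return base_score * (good / price)       # {!boost} multiplies the wrapped query's score


# Two hypothetical documents with identical text relevance (base_score = 1.0).
cheap_hit = boosted_score(1.0, rating=8, popularity=500, cat_weight=1.0, price=200.0)
pricey_hit = boosted_score(1.0, rating=8, popularity=500, cat_weight=1.0, price=800.0)
```

With everything else equal, the cheaper product ends up with four times the score of the one that costs four times as much, which is the point of dividing by `price`.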
User Driven Biasing
Prerequisites
- Analyze User Behavior
- Determine User Preferences
- Quantify Strength of Preferences
Simple Example: Sam
- Tends to buy cheap products
- Recently looking at a lot of laptops
q = {!boost b=$b v=$qq}
b = mul(query($pref),pow($diff,$diffs))
pref = cat:laptop
diff = div(1,price)
diffs = 0.72
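A rough sketch of Sam's bias in plain Python, to show how the `diffs` exponent softens the price preference. The name `sam_boost` is hypothetical, and `pref_score` is an assumed stand-in for whatever `query($pref)` yields for a document (taken here to be 0 when the document does not match `cat:laptop`).

```python
def sam_boost(pref_score, price, diffs=0.72):
    """b = mul(query($pref), pow($diff, $diffs)) with diff = div(1,price)."""
    diff = 1.0 / price                # cheaper products give a larger diff
    return pref_score * diff ** diffs # exponent below 1 flattens the price effect


# Two hypothetical laptops that match the preference query equally well.
budget = sam_boost(pref_score=1.0, price=500.0)
premium = sam_boost(pref_score=1.0, price=1000.0)
ratio = budget / premium  # equals 2 ** 0.72, a bit less than the raw 2x price gap
```

Because the exponent is below 1, halving the price raises the boost by less than a factor of two, so the preference nudges the ranking rather than dominating it.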
Simple Example: Sally
- Tends to buy new products
- Recently looking at "Apple" products
q = {!boost b=$b v=$qq}
b = mul(query($pref),pow($diff,$diffs))
pref = mfg:Apple
diff = recip(ms(NOW,proddate),3.16e-11,1,1)
diffs = 0.72
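A small sketch of the recency term in Sally's bias, assuming Solr's `recip(x,m,a,b)` evaluates to `a / (m*x + b)` and that `ms(NOW,proddate)` is the product's age in milliseconds. The names `sally_boost` and `MS_PER_YEAR` are hypothetical, and `pref_score` again stands in for the value of `query($pref)`.

```python
MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000  # about 3.156e10, so 3.16e-11 is roughly 1 / year


def recip(x, m, a, b):
    """Plain-Python version of recip(x,m,a,b) = a / (m*x + b)."""
    return a / (m * x + b)


def sally_boost(pref_score, age_ms, diffs=0.72):
    """b = mul(query($pref), pow($diff, $diffs)) with diff = recip(age, 3.16e-11, 1, 1)."""
    diff = recip(age_ms, 3.16e-11, 1, 1)  # 1.0 for a brand-new product, about 0.5 at one year
    return pref_score * diff ** diffs


brand_new = sally_boost(pref_score=1.0, age_ms=0)
year_old = sally_boost(pref_score=1.0, age_ms=MS_PER_YEAR)
```

The constant `3.16e-11` makes the term read as `1 / (age_in_years + 1)`, so the boost decays smoothly as a product ages instead of cutting off at a fixed date.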
Sweet Spot Example
Recently clicked on $1000-1200 price facet
q = {!boost b=$b v=$qq}
b = div(1,sqrt(sum(1,$mult)))
mult = mul($s,sub(sum(abs(sub($bias,$min)),
abs(sub($bias,$max))),
sub($max,$min)))
bias = price
min = 1000
max = 1200
s = 0.08
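The nested function above is easier to follow when written out: `mult` is zero anywhere inside the `[min, max]` range and grows with twice the distance outside it. The sketch below restates the same arithmetic in Python; the function name `sweet_spot_boost` is hypothetical, and the defaults are the values from the slide.

```python
import math


def sweet_spot_boost(price, lo=1000.0, hi=1200.0, s=0.08):
    """b = div(1, sqrt(sum(1, $mult))) with $bias = price."""
    # |price - lo| + |price - hi| equals (hi - lo) inside the range,
    # so the subtraction leaves 0 there and 2 * distance outside it.
    outside = abs(price - lo) + abs(price - hi) - (hi - lo)
    mult = s * outside
    return 1.0 / math.sqrt(1.0 + mult)


in_range = sweet_spot_boost(1100.0)   # inside the clicked facet: no penalty
too_cheap = sweet_spot_boost(900.0)   # 100 below the range
too_pricey = sweet_spot_boost(1300.0) # 100 above the range
```

Products inside the facet range keep their full score, and the penalty is symmetric on either side of the range, with `s` controlling how quickly it falls off.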
In Conclusion
- You know more about your users and your data than some soulless algorithm
- Think outside the box