[TriZPUG] PyCon Presentations?

Erik Rose grinch at grinchcentral.com
Thu Feb 19 15:19:09 CET 2015


I'm giving an in-depth tutorial about Elasticsearch, the next generation of the one Laura Thomson and I gave (to 4.3/5 reviews) at OSCON 2014:

https://us.pycon.org/2015/schedule/presentation/330/

They aren't making the full outline visible on the site, so here it is:

- Intro (10 minutes)
    - Humble Beginnings
    - Doc Deficiencies
    - ?pretty Please
    - REST, Lucene, and CAP
    - Don't Use Elasticsearch If…
- Practical Self-Defense (10 minutes)
    - Zen Multicast
    - Network Binding
    - Cluster Name
- Data Structure Basics (20 minutes)
    - Document IDs
    - Type-guessing
    - What mappings are
    - Exercise: GET and DELETE docs
    - Parallels with DB indexing
    - Arrays and how they're searched
- Mappings (15 minutes)
    - Not Really Schemaless
    - Data Types
    - Best Practices
    - Lurking Horrors
- Queries (45 minutes)
    - Filters vs. Queries
    - Term vs. match and why this will save you days of pain
    - Match phrase queries
    - Faceting
    - Scoring
        - Custom-scoring queries
        - Boosts
    - Exercises
        - Writing different types of queries
        - Using explain to dig into scoring
- Analysis (20 minutes)
    - Relationship with inverted index
    - 4-fold path to analysis
        - Char filter
        - Tokenizer
        - Token filter
        - Analyzer
    - What Analyzers Can Do
    - Choosing appropriate analysis: what kinds speed which queries?
    - Common Cases
        - Names
        - Street Addresses
    - Multi-language support
    - Query analyzers (vs. index analyzers)
    - Shrinking your index
        - What's the point?
            - Is every part of your index equally hot?
            - Is your index bigger than RAM?
            - How's your I/O speed?
        - Compression
        - `_source`: to store or not to store?
    - Exercises
        - Testing analyzers with the `_analyze` API
        - Write a mapping, including appropriate analyzers, to improve
          upon the default
        - Multi-word tags indexed as atoms
- Interfacing With Python (likely broken up and scattered throughout
  the other sections) (15 minutes)
    - Overview of Libraries
    - Bulk Indexing
    - Data Update Strategies
    - Result Fetching Alternatives
    - Testing
- Clustering (15 minutes)
    - Shards and Replicas
    - Pitfalls
    - General Best Practices
    - Why Best Practices Aren't Generalizable
    - Adding New Nodes Without Downtime
    - Split Brain: You're Not Feeling Lucky
    - Monitoring
- Optimization (10 minutes)
    - RAM
        - mlockall
        - ES_HEAP_SIZE at 50%
    - File Descriptors
    - Stores
    - The JVM, GC, And You
    - MySQL Horrors
    - Shrinking Indices
    - Filter Caching
- Dealing With The Future (10 minutes)
    - Changing mappings
    - Mergeable and unmergeable changes
    - Reindexing
    - ES As A Primary Datastore
- Fancy Features (10 minutes)
    - Synonyms
    - Suggesters
    - Autocompletion
    - Percolation
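
For a taste of the "Term vs. match" and "4-fold path to analysis" items above, here is a toy sketch in plain Python. It is not Elasticsearch code and uses no client library; every function name in it (char_filter, tokenizer, token_filter, analyze, build_index, term_query, match_query) is made up for illustration, and real analyzers do considerably more. The assumption it illustrates is that a term lookup skips analysis while a match lookup runs the query text through the same analyzer used at index time.

```python
import re
from collections import defaultdict


def char_filter(text):
    # Step 1 (hypothetical): strip HTML-like tags before tokenizing.
    return re.sub(r"<[^>]+>", " ", text)


def tokenizer(text):
    # Step 2 (hypothetical): split the text into word tokens.
    return re.findall(r"\w+", text)


def token_filter(tokens):
    # Step 3 (hypothetical): normalize each token, here by lowercasing.
    return [t.lower() for t in tokens]


def analyze(text):
    # Step 4: an "analyzer" is just the three stages chained together.
    return token_filter(tokenizer(char_filter(text)))


def build_index(docs):
    # Toy inverted index: analyzed token -> set of document IDs.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def term_query(index, term):
    # No analysis: the term must equal an indexed token exactly.
    return index.get(term, set())


def match_query(index, text):
    # Analyze the query text first, then OR together the token hits.
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


docs = {1: "<b>Quick</b> Brown Fox", 2: "lazy dog"}
index = build_index(docs)

print(term_query(index, "Quick"))   # misses: the index holds "quick"
print(match_query(index, "Quick"))  # finds doc 1 after lowercasing
```

In this sketch the term lookup for "Quick" comes back empty because only the lowercased token was indexed, which is the flavor of surprise the term-vs.-match item is about.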

Maybe I'll see some of you there!

Cheers,
Erik

> On Feb 18, 2015, at 4:23 PM, Chris Calloway <cbc at unc.edu> wrote:
> 
> Who among TriPython is giving a presentation (talk/poster/tutorial) at PyCon?
