Friday, March 28, 2014

I'm speaking at #SP24 - The free 24h SharePoint online conference



I'm excited to be part of #SP24! My session will be: "Custom Indexing Connectors - How to integrate external content into SharePoint Search"

The conference trailer (wow!):


My 1-minute speaker video:

My actual session recording is already done. We had some cool equipment like professional lights, dedicated mic, cam and plant - and it went quite well. Stay tuned :)

Sunday, March 23, 2014

My notes for Hybrid Search from the SharePoint Conference #SPC14

This is my collection of information about Hybrid Search, gathered from session recordings of the SharePoint Conference 2014.

Emphasis is on volatile information snippets like quotes from session speakers you would probably never ever find again - and info that makes your head say "I should remember this".

SPC306 - "Best practices for Hybrid Search deployments"

Speakers: Brent Groom, Norm Lambert
Video

General Info
  • Crawls are done on the respective system - one crawl on-prem and one crawl in Office 365
Best Practices
  • Microsoft doesn't add Office 365 search results as a Result Block to on-prem search results but keeps on-prem and remote results separate
  • Outbound or Inbound Hybrid Search? Look where you have the most content and decide accordingly.
  • Don't pass all search requests to the remote SharePoint - use keywords or similar triggers instead
  • Perform latency tests - local and remote SharePoint are queried synchronously, so the user always has to wait for both, and latency can be critical here
    • If latency is too high the recommendation is to write your own search results web part (at the cost of OOB functionality) ...
  • Use Microsoft WAP as reverse proxy
  • Don't use Microsoft UAG as reverse proxy
  • Fiddler can be used as reverse proxy during development
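
The latency test recommended above could be sketched roughly like this. It is only a minimal illustration: `time_query` and `wait_bounds` are hypothetical helper names, and the callables you pass in stand for whatever code issues one query against the local or the remote farm. Since the session only said the user "has to wait for both", the sketch reports two bounds (the slower of the two and the sum of both) instead of assuming one of them.

```python
import statistics
import time
from typing import Callable, List, Tuple


def time_query(run_query: Callable[[], object], runs: int = 5,
               clock: Callable[[], float] = time.perf_counter) -> List[float]:
    """Run a query several times and return each duration in seconds.

    run_query is a hypothetical callable that issues one search request.
    """
    durations = []
    for _ in range(runs):
        start = clock()
        run_query()
        durations.append(clock() - start)
    return durations


def wait_bounds(local: List[float], remote: List[float]) -> Tuple[float, float]:
    """Return (slower_of_both, sum_of_both) based on the median durations.

    Assumption: the real wait lies somewhere between these two values,
    depending on how much the two queries overlap.
    """
    local_median = statistics.median(local)
    remote_median = statistics.median(remote)
    return max(local_median, remote_median), local_median + remote_median
```

Run it a few times per day from a machine close to your users; a single measurement says little about a connection to the cloud.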
Restrictions
  • Custom Display Templates are not possible for Hybrid Search Results
  • Stress testing against Office 365 is not permitted
Crystal Ball

Currently remote search results are displayed like Federated Search results - improving this situation is on the roadmap (interleaving search results, "from query rules to remote index"), but there is no date yet.

SPC320 - "Hybrid Search: Configure Outbound Hybrid Search in SharePoint Online with Password Sync"

Speakers: Manas Biswas, Neil Hodgkinson
Slides
Video

General Info
  • ADFS not needed for setting up hybrid search (only needed for SSO) - DirSync takes care of transferring claims from on-prem to O365
  • Security Trimming is applied to search results
  • There are two flavors of Hybrid: hybrid between on-prem SharePoint and Office 365 works differently from hybrid between on-prem SharePoint and Dedicated SharePoint (Hosted)
    • the latter doesn't need the ACS but only standard domain trust (?)
  • Interesting: Configuring trust between on-premise farm and Office 365 makes use of high-trust app terminology and cmdlets (app principal, certificates, token issuer...)
    • O365 is being treated like a high-trust app (?)
  • Key claims used for validating users between on-prem and O365:
    • User principal name (UPN)
    • Name Identifier (e.g. SID)
    • SMTP address
    • SIP address
  • Result Source Protocol for remote search results is "Remote SharePoint"
  • Document Preview in Search Results on-prem and Office 365 should work "in both locations" if Office Web App components are installed in "both locations"
Best Practices
  • DirSync schedule is mentioned as running every 3 hours
  • Start setting up Outbound Hybrid Search - it is the easiest one and prepares you for Inbound and Two-Way Hybrid
  • Use self-signed certificates for STS trust
Restrictions
  • Refiners don't work for O365 search results in demoed Outbound scenario (doh!)
  • Cannot use domain-issued certificates for establishing STS trust
Crystal Ball

Asked about refiners in the Outbound Hybrid Search scenario, Neil says: "There is no refinement yet" and "The roadmap is to improve that service" - where Manas is quick to say "Not as of now", which makes Neil giggle. Feels like this is more on the longer-term roadmap...

Friday, February 28, 2014

Conference Schedule for 2014

This year I will attend the following conferences:
My topic in Zürich is "Custom Indexing Connectors - How to integrate external system into your SharePoint Enterprise Search" (presented in German).

There are some other interesting conferences I will attend if I get in as a speaker - I am trying to place a talk about integrating Yammer into SharePoint Search (surprise!).

Meet you there!

Progress and Architecture of the Yammer Search Connector for SharePoint

Over the last few days I took some time to improve my Yammer Search Connector for SharePoint. I do this partly as preparation for my talk at Collaboration Days in Zürich and partly out of curiosity to see if this can work.

So far everything worked well. The display templates for displaying the SharePoint search results are always more work than expected, especially if you have to write asynchronous JavaScript in SharePoint Designer.

I chose to adapt the Microblog Display Template that is normally used for displaying Newsfeed posts.

This is the result:


Not too bad, is it?

You see the similarity to the Newsfeed search results, but this time all data is from Yammer: content, author, creation date, like count, reply count and whether it is a "root" post or not. If it's a root the caption says "Xy yammered..." and for a reply it says "Xy answered...".

The user image is also from Yammer. It will even be displayed if you are not logged in to Yammer.

On the left you see some other data from Yammer being used as refiners: groups, network names and like count. This could easily be extended to topics or result type.

Architecture

Now for the architecture. Here is a diagram:


As mentioned this works quite well. (The Sync-apps use the YamrSync library.)

Here is my reasoning for using a cache in between Yammer and SharePoint:
  • a SharePoint full crawl would hammer the Yammer REST API - hitting API limits and possibly getting some attention because of causing excessive load
  • Yammer is in the cloud, not your local data center - all data is downloaded over your Internet connection, so it can take some time to get everything, especially big documents. A crawl would take forever, and continuous crawls are not supported for non-SharePoint content
The idea is to use a small console application loading data from Yammer over time. This of course has caveats:
  • we essentially duplicate what the SharePoint crawl does (although with more control over how we crawl)
  • we need a cache
  • the cache currently is MongoDB, which is robust and scalable but outside the Microsoft world - would this fit in?
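
To make the "loading data from Yammer over time" idea a bit more concrete, here is a minimal sketch of one sync pass. It is not the YamrSync API and not the real connector: `sync_once`, `fetch_newer` and the request budget are hypothetical names, a plain dictionary stands in for the MongoDB cache, and the actual Yammer API limits are not modeled beyond "stop after N requests".

```python
from typing import Any, Callable, Dict, List, Optional

Message = Dict[str, Any]


def sync_once(fetch_newer: Callable[[Optional[int]], List[Message]],
              cache: Dict[int, Message],
              last_seen_id: Optional[int],
              max_requests: int) -> Optional[int]:
    """Pull new messages into the cache until caught up or the budget is spent.

    fetch_newer(last_seen_id) is a hypothetical callable returning one page of
    messages newer than the given id. Returns the new high-water mark, which
    the caller should persist for the next run.
    """
    for _ in range(max_requests):
        batch = fetch_newer(last_seen_id)
        if not batch:
            break  # caught up, nothing left to download
        for message in batch:
            cache[message["id"]] = message  # upsert keyed by message id
        newest = max(message["id"] for message in batch)
        last_seen_id = newest if last_seen_id is None else max(last_seen_id, newest)
    return last_seen_id
```

Because each run continues from the stored high-water mark, the console application can be scheduled as often as the API budget allows, and the SharePoint crawl only ever talks to the cache.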

So what do you think?

Is this a road worth pursuing further? Or is this prototype good for its initial purpose - demonstrating how you can integrate external systems into SharePoint Search - but not more? Maybe Yammer would pull the plug anyway.

What would be the minimum requirements to run this in a production environment? And is it possible to meet those requirements?

I'd love to hear your take on this in the comments!

Saturday, February 22, 2014

YamrSync - Yammer REST API wrapper and data synchronization library

My first open source project - a REST API wrapper and data synchronization library for Yammer that can be used from .NET applications. Yay!

Check it out: YamrSync

I need it for my Yammer SharePoint Search Integration project (working title: "One Search to Rule Them All"). More info coming soon. Just the basic thought: with the Yammer API limits in place it is not possible to do a full crawl on a Yammer repository on a regular basis. Furthermore we must assume a slow internet connection (remember we need the documents, too). So we basically use MongoDB as a data cache SharePoint will crawl against. And we need the database for storing user permissions our security pre-trimmer will use.
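
A rough sketch of how the stored permissions could feed the pre-trimmer, under my own assumptions: the claim format (`user:...`, `group:...`), the membership table and the `acl` field are all hypothetical and not taken from the actual project. In a real setup the search index does the matching; `visible` only simulates that step so the idea can be tried out.

```python
from typing import Any, Dict, List, Set


def claims_for_user(user: str, memberships: Dict[str, Set[str]]) -> Set[str]:
    """Build the set of claims attached to a user's query.

    memberships maps a user name to the Yammer groups stored in the cache
    (hypothetical schema).
    """
    claims = {"user:" + user}
    claims.update("group:" + group for group in memberships.get(user, set()))
    return claims


def visible(docs: List[Dict[str, Any]], claims: Set[str]) -> List[Dict[str, Any]]:
    """Simulate the index: keep documents whose ACL shares a claim with the user."""
    return [doc for doc in docs if claims & set(doc["acl"])]
```

The point of doing this before the query runs is that the user's claims become part of the query itself, so refiners and result counts only reflect what the user is allowed to see.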

Saturday, February 8, 2014

Search integration for (on-premises) SharePoint and Yammer - coming soon?

Every now and then I have an idea that sticks in my head and does not go away. When this happens it is often good to talk to people about it because that can bring you back to earth fairly quickly.

I am currently preparing for my session at Collaboration Days Zürich which is about "Custom Indexing Connectors - How to integrate external system into your SharePoint Enterprise Search". Thinking about a good use case to demonstrate the power of Indexing Connectors I came to the conclusion that Yammer integration would be a perfect showcase. It is cloud-only and separated from SharePoint, yet marketed as a replacement for the Newsfeed. There is currently no search integration between SharePoint and Yammer.

So why not use the Yammer API to get my hands on its content and feed it to SharePoint? Sounds great, doesn't it? Of course there are challenges - to name two: API limitations and security mapping. But you don't know for sure until you have tried it. There even is a .NET wrapper for the Yammer REST API on Steve Peschka's blog. Great!

But wait a minute - the project on his site is called YammerCrawlSync. This unfortunately dampened my enthusiasm a bit because it sounds like somebody else might already have something up his sleeve. And maybe something will be presented at SPC 2014 anyway regarding the Yammer-SharePoint search integration.

So what do you think? Will there be an announcement about search integration of SharePoint and Yammer at SPC? And will it include on-premises scenarios?

My bet is: integration is coming, but for now it's cloud-only (apart from a hybrid prototype from Steve Peschka).