Single Sign-on access to the CLARIN-D infrastructure

The single sign-on (SSO) infrastructure enables CLARIN-D users to access electronic resources from several institutions without having to apply for and administer a large number of individual logins. A single username/password combination, usually provided by the user's home institution, ensures that the scientific community gains access to certain distributed resources while the general public does not.

This is especially useful in the case of data with legal restrictions. These can include privacy issues in the case of experimental data from psycholinguistic experiments or copyright restrictions in the case of corpus and computational linguistics, ancient history, archaeology and other fields of research. The SSO infrastructure is equally useful when a CLARIN-D user wants to grant others access to legally restricted resources. It even handles the case where only certain groups in the scientific community are legally allowed to access a resource. If, for example, participants in a psycholinguistic experiment agreed to have their data shared with PhD candidates and researchers, but not with undergraduate students, the SSO infrastructure can ensure that this legal restriction is enforced.

To gain access to resources, the CLARIN-D user's home institution has to maintain an identity provider (IdP) that stores username/password combinations (or other authentication information, e.g. for use with smartcards) as well as additional attributes about a user. Most often this will be the Shibboleth software, with attributes represented in SAML (Security Assertion Markup Language). The set of attributes typically comprises the affiliation status within the institution, the email address, or an anonymized unique identifier. To learn whether their institution runs an IdP and to obtain their login information, users should ask their local IT service.

If the CLARIN-D user's home institution does not run a SAML IdP and will not be able to do so in the medium term, CLARIN-D offers a fallback solution. Users can register with the CLARIN IdP to access resources until their home institution has deployed its own IdP.

Once login information is provided, either by a home institution or by the CLARIN IdP fallback solution, resources protected by the SSO infrastructure can be accessed. If the user is not legally allowed to access a resource (e.g. because the user is an undergraduate student while the resource can be viewed only by senior researchers), the hosting institution can deny access even after a successful authentication.

To protect resources that are distributed using the SSO infrastructure, the CLARIN-D user's home institution has to run a SAML service provider. A service provider (SP) can be configured with the conditions that qualify a person to view a resource. These could be the person's home institution (e.g. researchers and undergraduates from University A can access the resource, undergraduates from University B cannot), their status within their institution (e.g. undergraduate student, senior researcher), or some other criterion. In the future, access can even be restricted to particular users. To learn whether their institution runs an SP and which steps are needed to protect resources, users should ask their local IT service.
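The kind of condition an SP evaluates can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the configuration format of Shibboleth or any other SP software. The attribute keys `institution` and `affiliation`, the `AccessRule` class, and the host name `university-a.example` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRule:
    """Hypothetical rule: affiliations of one institution that may view a resource."""
    institution: str
    allowed_affiliations: frozenset


def is_authorized(attributes: dict, rules: list) -> bool:
    """Return True if the attributes released by the IdP satisfy at least one rule."""
    institution = attributes.get("institution")
    affiliation = attributes.get("affiliation")
    return any(
        rule.institution == institution and affiliation in rule.allowed_affiliations
        for rule in rules
    )


# Policy from the example in the text: researchers and undergraduates from
# University A may access the resource. No rule exists for University B,
# so its undergraduates are denied by default.
RULES = [
    AccessRule("university-a.example", frozenset({"researcher", "undergraduate"})),
]

print(is_authorized({"institution": "university-a.example",
                     "affiliation": "undergraduate"}, RULES))
print(is_authorized({"institution": "university-b.example",
                     "affiliation": "undergraduate"}, RULES))
```

Denying by default when no rule matches is a deliberate choice in this sketch: a resource with legal restrictions should stay closed unless a rule explicitly opens it.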

In case the CLARIN-D user's home institution does not run a SAML SP and will not be able to do so in the mid term, CLARIN-D centers can be contacted to explore the possibility of hosting and protecting resources with the server infrastructure they provide.

The IdP and SP of an institution are expected to be part of the SAML federation built by the national research infrastructure organization. In Germany this is the Deutsches Forschungsnetz (DFN) with its federation, the DFN-AAI. It can make sense to additionally deploy a discovery service (DS), which assists users in choosing an appropriate IdP. Although the central CLARIN DS can be used, a DS at the home institution will come in handy if the central service fails or malfunctions.

Figure 4.1, “Authentication sequence” shows the communication flow running in the background when a user authenticates to access a resource. When a user requests a resource from the SP, the SP responds with a list of accepted IdPs in the form of a DS. The user selects her home institution and authenticates against the home institution's IdP. If the authentication is successful, the IdP tells the SP so and the resource can be accessed (provided the user is authorized to use this resource).
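The sequence can also be followed in a small in-memory model. This is a toy sketch under stated assumptions: no real SAML messages, signatures, or browser redirects are involved, and the class names, method names, user names, and status strings are all invented for illustration. It does reflect one property of the real flow: the user's password is only ever seen by the IdP, and the SP receives nothing but the IdP's statement about the user.

```python
class IdentityProvider:
    """Hypothetical IdP holding credentials and attributes of its users."""

    def __init__(self, name, accounts):
        self.name = name
        self._accounts = accounts  # username -> (password, attributes)

    def authenticate(self, username, password):
        """Return an assertion (issuer plus attributes) or None on failure."""
        record = self._accounts.get(username)
        if record is None or record[0] != password:
            return None
        return {"issuer": self.name, "attributes": dict(record[1])}


class ServiceProvider:
    """Hypothetical SP protecting a single resource."""

    def __init__(self, accepted_idps, allowed_affiliations):
        self._accepted = set(accepted_idps)
        self._allowed = set(allowed_affiliations)

    def request_resource(self):
        """Answer a resource request with the list of accepted IdPs (the DS)."""
        return sorted(self._accepted)

    def consume_assertion(self, assertion):
        """Check the IdP's statement, then decide whether the user is authorized."""
        if assertion is None or assertion["issuer"] not in self._accepted:
            return "authentication failed"
        if assertion["attributes"].get("affiliation") not in self._allowed:
            return "access denied"
        return "resource delivered"


idp = IdentityProvider("university-a.example", {
    "alice": ("secret", {"affiliation": "researcher"}),
    "bob": ("hunter2", {"affiliation": "undergraduate"}),
})
sp = ServiceProvider(["university-a.example"], ["researcher"])

choices = sp.request_resource()                # user requests the resource, gets the DS list
assertion = idp.authenticate("alice", "secret")  # user logs in at the chosen IdP
print(choices, sp.consume_assertion(assertion))
```

With these sample accounts, `bob` authenticates successfully but is still refused by the SP, which mirrors the distinction between authentication and authorization made in the text.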