Extension attributes
1. Extension attributes
In this video, you'll learn how to add custom attributes to the Doc, Token and Span objects to store custom data.
2. Setting custom attributes
Custom attributes let you add any metadata to Docs, Tokens and Spans. The data can be added once, or it can be computed dynamically. Custom attributes are available via the dot-underscore property. This makes it clear that they were added by the user, and not built into spaCy, like token dot text. Attributes need to be registered on the global Doc, Token and Span classes, which you can import from spacy dot tokens. You've already worked with those in the previous chapters. To register a custom attribute on the Doc, Token or Span, you can use the set extension method. The first argument is the attribute name. Keyword arguments let you define how the value should be computed. In this case, it has a default value and can be overwritten.
3. Extension attribute types
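To make the registration step from "Setting custom attributes" concrete, here is a minimal sketch. The attribute names ("title", "is_color", "has_color"), the example sentence and the blank English pipeline are assumptions chosen for illustration, and force=True is only there so the snippet can be re-run without an "already registered" error.

```python
import spacy
from spacy.tokens import Doc, Token, Span

# Register one custom attribute on each global class.
# The names and default values are illustrative assumptions.
Doc.set_extension("title", default=None, force=True)
Token.set_extension("is_color", default=False, force=True)
Span.set_extension("has_color", default=False, force=True)

# A blank English pipeline is enough here: only the tokenizer is needed.
nlp = spacy.blank("en")
doc = nlp("The sky is blue.")

# Custom attributes live under the ._ property and can be overwritten.
doc._.title = "My document"
title = doc._.title
```

Until they are overwritten, all three attributes return the default they were registered with.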
There are three types of extensions: attribute extensions, property extensions and method extensions.
4. Attribute extensions
Attribute extensions set a default value that can be overwritten. For example, a custom "is color" attribute on the token that defaults to False. On individual tokens, its value can be changed by overwriting it – in this case, True for the token "blue".
5. Property extensions (1)
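The attribute extension described on the "Attribute extensions" slide could look roughly like this. The sentence "The sky is blue." and the blank English pipeline are assumptions for the sketch; force=True is added only so the example can be re-run.

```python
import spacy
from spacy.tokens import Token

# Attribute extension: a plain default value that can be overwritten.
Token.set_extension("is_color", default=False, force=True)

nlp = spacy.blank("en")
doc = nlp("The sky is blue.")

# Tokens are: "The", "sky", "is", "blue", "." so index 3 is "blue".
doc[3]._.is_color = True

blue_flag = doc[3]._.is_color  # overwritten on this token only
sky_flag = doc[1]._.is_color   # still the registered default
```

Only the token that was overwritten changes; every other token keeps the default.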
Property extensions work like properties in Python: they can define a getter function and an optional setter. The getter function is only called when you retrieve the attribute. This lets you compute the value dynamically, and even take other custom attributes into account. Getter functions take one argument: the object, in this case, the token. In this example, the function returns whether the token text is in our list of colors. We can then provide the function via the getter keyword argument when we register the extension. The token "blue" now returns True for "is color".
6. Property extensions (1)
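The token-level property extension just described can be sketched as follows. The function name get_is_color, the color list and the example sentence are assumptions for illustration; force=True only guards against re-registration errors.

```python
import spacy
from spacy.tokens import Token

def get_is_color(token):
    # Getter: receives the token and computes the value on access.
    colors = ["red", "yellow", "blue"]  # illustrative list of colors
    return token.text in colors

# Property extension: pass the function via the getter keyword argument.
Token.set_extension("is_color", getter=get_is_color, force=True)

nlp = spacy.blank("en")
doc = nlp("The sky is blue.")

blue_is_color = doc[3]._.is_color  # getter runs here, for "blue"
sky_is_color = doc[1]._.is_color   # getter runs here, for "sky"
```

Because the getter runs on every access, no value has to be stored on the token in advance.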
If you want to set extension attributes on a Span, you almost always want to use a property extension with a getter. Otherwise, you'd have to update *every possible span ever* by hand to set all the values. In this example, the "get has color" function takes the span and returns whether the text of any of the tokens is in the list of colors. After we've processed the doc, we can check different slices of the doc and the custom "has color" property returns whether the span contains a color token or not.
7. Method extensions
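For the span-level case described under "Property extensions (2)", a getter might be written like this. The name get_has_color follows the narration; the color list, the sentence and the chosen slices are assumptions for the sketch.

```python
import spacy
from spacy.tokens import Span

def get_has_color(span):
    # Getter: True if any token in the span is one of the listed colors.
    colors = ["red", "yellow", "blue"]  # illustrative list of colors
    return any(token.text in colors for token in span)

# Registered once, the getter works for every slice of every doc.
Span.set_extension("has_color", getter=get_has_color, force=True)

nlp = spacy.blank("en")
doc = nlp("The sky is blue.")

with_color = doc[1:4]._.has_color     # slice "sky is blue"
without_color = doc[0:2]._.has_color  # slice "The sky"
```

This is why a getter is the practical choice for spans: the value is derived from whichever slice you ask about, so nothing has to be set by hand.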
Method extensions make the extension attribute a callable method. You can then pass one or more arguments to it, and compute attribute values dynamically – for example, based on a certain argument or setting. In this example, the method function checks whether the doc contains a token with a given text. The first argument of the method is always the object itself – in this case, the Doc. It's passed in automatically when the method is called. All other function arguments will be arguments on the method extension. In this case, "token text". Here, the custom "has token" method returns True for the word "blue" and False for the word "cloud".
8. Let's practice!
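As a recap of the method extension from the previous slide, here is a minimal sketch. The function name has_token and its token_text parameter follow the narration; the example sentence and the blank pipeline are assumptions.

```python
import spacy
from spacy.tokens import Doc

def has_token(doc, token_text):
    # The doc is passed in automatically; token_text comes from the caller.
    return token_text in [token.text for token in doc]

# Method extension: pass the function via the method keyword argument.
Doc.set_extension("has_token", method=has_token, force=True)

nlp = spacy.blank("en")
doc = nlp("The sky is blue.")

found_blue = doc._.has_token("blue")
found_cloud = doc._.has_token("cloud")
```

Note that the caller only supplies the token text; the Doc itself never appears in the call.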
Now it's your turn. Let's add some custom extensions!