Descriptions and definitions (WIPO)
Data set description
- The data source for all citation information is the European Patent
Office’s PATSTAT database. Bibliographic information for international
applications is taken mainly from PATSTAT, supplemented by WIPO internal
databases where they hold information that is not available in PATSTAT.
- The data provided is based on
published PCT searches.
- Statistics are presented by search date, up to 2011 Q4. Search date
here means the date on which an international search report was
transmitted to the International Bureau, since this information is
available more consistently than the actual date of search.
- The date ranges for
statistics take into account data availability. This is constrained by
procedural latency such as time to publication, as well as cut-off dates
for database extracts.
- No filing date constraint is
applied.
Data issues
- Applications with no citation
recorded are removed, as no meaningful international search was carried
out for these applications.
- A small number of patent
citations are without category codes.
- Where citations with citn_origin = 5 (documents cited during international
search) are recorded, those citations are used; otherwise, citations with
citn_origin = 0 (documents cited during search) are used. Citations with
other citn_origin codes are removed.
- NPL citations with no category assigned and with ID >= 900000000 are
removed, as they do not appear to be in the original search reports.
- All citation category codes
recorded in the database for the valid citations are considered.
- Citation language codes for national patent documents are those recorded
in the PATSTAT database. Citation language codes for PCT documents are
assigned using WIPO’s PCT database, as they are more precise. The language
codes are further cleaned up using information about the authorities that
publish those documents.
- No attempt has been made to
determine the language of publication of non-patent literature (NPL) documents.
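The citation filtering rules above can be sketched as follows. This is only an illustration, not the actual extraction code: the record keys (appln_id, is_npl, npl_id, categories) are hypothetical, and it assumes that the choice between citn_origin = 5 and citn_origin = 0 is made per application.

```python
from collections import defaultdict

NPL_ID_THRESHOLD = 900_000_000  # threshold quoted in the data issues list


def select_valid_citations(citations):
    """Return the citations kept for the statistics.

    `citations` is a list of dicts with hypothetical keys:
    appln_id, citn_origin, is_npl, npl_id, categories.
    """
    by_application = defaultdict(list)
    for citation in citations:
        by_application[citation["appln_id"]].append(citation)

    valid = []
    for app_citations in by_application.values():
        origins = {c["citn_origin"] for c in app_citations}
        # Assumption: prefer origin 5 when the application has any,
        # otherwise fall back to origin 0. Other codes are dropped.
        wanted = 5 if 5 in origins else 0
        for c in app_citations:
            if c["citn_origin"] != wanted:
                continue
            # Drop uncategorised NPL citations with very high IDs.
            if c["is_npl"] and not c["categories"] and c["npl_id"] >= NPL_ID_THRESHOLD:
                continue
            valid.append(c)
    return valid
```

Applications left with no valid citation after this step would then be removed, in line with the first point of the list.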
Definition of concepts
Technology breakdown
- Technology sector and field
are derived from the IPC classes assigned in the international phase
search report or publication.
- The grouping into technology sector and field is based on a concordance
provided by WIPO (http://www.wipo.int/ipstats/en/statistics/patents/pdf/wipo_ipc_technology.pdf).
- This technology breakdown
includes 8 technology sectors (Electrical engineering, Energy technology,
Instruments, Mechanical engineering, Micro-structural and nano-technology, Other fields and Semiconductors),
which are further broken down into 35 technology fields.
- Multiple IPC classes are often assigned to applications. For the present
statistics, a fractional counting method is applied: an international
application and all citations in its search report are distributed evenly
across the technology fields associated with it.
- IPC class information is not
available for approximately 1% of applications.
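The fractional counting described above can be illustrated with a short sketch. The input shape (a list of technology fields plus a citation count per application) and the field names in the usage note are assumptions made for the example only.

```python
from collections import Counter


def fractional_counts(applications):
    """Distribute each application and its citations evenly over its fields.

    `applications` is an iterable of (fields, n_citations) pairs, where
    `fields` lists the technology fields derived from the IPC classes.
    """
    application_totals = Counter()
    citation_totals = Counter()
    for fields, n_citations in applications:
        unique_fields = set(fields)
        if not unique_fields:
            continue  # no IPC class information available
        share = 1.0 / len(unique_fields)
        for field in unique_fields:
            application_totals[field] += share
            citation_totals[field] += share * n_citations
    return application_totals, citation_totals
```

For instance, an application associated with two fields and carrying 4 citations contributes 0.5 applications and 2 citations to each of the two fields.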
Applicant origin
- In general this is the State
in which the first-named applicant is resident (overall, this gives a more
useful indication of origin of the application than the receiving Office
because the International Bureau and regional Offices work for many
States, whereas some States do not themselves operate a receiving Office).
- An “Unknown” code is used for a
small percentage of applications.
XY rate (Searches with XY citations)
- XY rate refers to the share of search reports where at least one citation
is in the category of X or Y.
- In addition, the use of an E citation is counted as XY if it can be
assumed that the E citation is prejudicial to novelty. This is the case
unless the E category is assigned in combination with A.
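As a minimal sketch of this definition, assuming each search report is represented as a list of citations and each citation as a set of category letters (a hypothetical representation, not the PATSTAT schema):

```python
def counts_as_xy(categories):
    """True if a single citation counts towards the XY rate."""
    cats = set(categories)
    if cats & {"X", "Y"}:
        return True
    # An E citation counts as XY unless it is combined with A.
    return "E" in cats and "A" not in cats


def xy_rate(search_reports):
    """Share of search reports with at least one XY-type citation."""
    if not search_reports:
        return 0.0
    hits = sum(
        1 for report in search_reports
        if any(counts_as_xy(citation) for citation in report)
    )
    return hits / len(search_reports)
```

With four reports whose citations are {X} and {A}, {A}, {E}, and {E, A} respectively, the first and third reports count, giving a rate of 0.5.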
Citation category availability
- PATSTAT does not contain all
citation categories for each citation. The database contains one citation
category per group of categories for each citation. The category groups
are defined as follows:
Group 1 | X Y A
Group 2 | P E
Group 3 | D
Group 4 | O T L
- Only one category from the same group is selected, determined by the
order in the table above. A citation is therefore categorized as X if its
categories in the search report are X and Y. Priority follows the
left-to-right ranking of categories within the groups above, rather than
the order in which they appear for the citation in the international
search report (that is, X is shown even if the search report lists Y
category claims first).
- A maximum of 3 categories is
recorded.
- Citation Category Examples:
Search report citation | Citation categories present in PATSTAT
X, Y, A, P | X, P
Y, A, P, E | Y, P
Y, X, O, T | X, O
X, P, D, O | X, P, D
- This means, for example, that in row 1 above neither the Y nor the A is
stored in PATSTAT.
- In practice it is therefore
possible to determine whether a search has at least one X or Y citation.
It is also possible to correctly count the number of X citations.
- In approximately 20% of
cases it is not possible to correctly count the number of Y categories
used, although it is possible to count the use of Y without an X.
- EPO data has been refined
with an additional internal data source.
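The selection rules in this section can be sketched as a small function. It is an illustration only, not PATSTAT's own implementation, and it assumes that when more than three groups are represented the first three in group order are kept, which is consistent with the last example row.

```python
# Category groups in priority order, as in the group table of this section.
CATEGORY_GROUPS = [
    ("X", "Y", "A"),  # Group 1
    ("P", "E"),       # Group 2
    ("D",),           # Group 3
    ("O", "T", "L"),  # Group 4
]
MAX_CATEGORIES = 3


def stored_categories(report_categories):
    """Reduce the categories of one citation to those kept in the database."""
    present = set(report_categories)
    kept = []
    for group in CATEGORY_GROUPS:
        # Keep only the highest-ranked category present in each group.
        for letter in group:
            if letter in present:
                kept.append(letter)
                break
    return kept[:MAX_CATEGORIES]
```

Applied to the four example rows, this gives X, P for X, Y, A, P; Y, P for Y, A, P, E; X, O for Y, X, O, T; and X, P, D for X, P, D, O.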
A only rate
- A-only rate refers to the
share of search reports where no citation is in the category of X, Y or E.
Y no X rate
- Y no X rate refers to the
share of search reports where at least one citation is in the category of
Y and there is no X citation.
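The A-only and Y no X rates can be computed along the same lines. The sketch below assumes the same hypothetical representation as before: a search report is a list of citations, and each citation is a set of category letters.

```python
def is_a_only(report):
    """True if no citation in the report is in category X, Y or E."""
    return not any(set(citation) & {"X", "Y", "E"} for citation in report)


def is_y_no_x(report):
    """True if the report has at least one Y citation and no X citation."""
    has_y = any("Y" in citation for citation in report)
    has_x = any("X" in citation for citation in report)
    return has_y and not has_x


def rate(search_reports, predicate):
    """Share of search reports for which `predicate` holds."""
    if not search_reports:
        return 0.0
    return sum(1 for report in search_reports if predicate(report)) / len(search_reports)
```

Because a Y is only hidden in the stored data when an X is recorded for the same citation, the Y no X test is not affected by the category selection described under citation category availability.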
Search date
- The date when the search
report is transmitted to WIPO (the actual date of search is not available
in all cases).
Patent Literature/Non-Patent Literature
- Citations in PATSTAT are categorised into patent literature and non-patent
literature.
- A citation is considered
patent literature if it relates to patent abstracts provided by various
providers.
- Less information is available for NPL citations. For example, the
language of an NPL citation is not available.
Non official language
This is used for counting patent
citations that are not in an official language of the respective ISA:
ISA | Official language
AT | German
AU | English
BR | Portuguese
CA | English
CA | French
CN | Chinese
EP | German
EP | English
EP | French
ES | Spanish
FI | Finnish
JP | Japanese
KR | Korean
RU | Russian
SE | Swedish
US | English
XN | Danish
XN | Icelandic
XN | Norwegian
- The statistics are based on the actual official languages of the Office,
but can easily be redefined to reflect any set of core languages which an
Office considers to be useful in assessing how effective its processes may
be at discovering prior art beyond those languages.
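A minimal sketch of this count, with the mapping taken from the table in this section. Language names are used instead of database language codes to avoid assuming a particular coding, and skipping citations whose language is unknown is an assumption of the example.

```python
# Official languages per ISA, as listed in the table in this section.
OFFICIAL_LANGUAGES = {
    "AT": {"German"},
    "AU": {"English"},
    "BR": {"Portuguese"},
    "CA": {"English", "French"},
    "CN": {"Chinese"},
    "EP": {"German", "English", "French"},
    "ES": {"Spanish"},
    "FI": {"Finnish"},
    "JP": {"Japanese"},
    "KR": {"Korean"},
    "RU": {"Russian"},
    "SE": {"Swedish"},
    "US": {"English"},
    "XN": {"Danish", "Icelandic", "Norwegian"},
}


def count_non_official(isa, citation_languages, official=OFFICIAL_LANGUAGES):
    """Count patent citations not in an official language of the ISA."""
    languages = official[isa]
    return sum(
        1 for lang in citation_languages
        if lang is not None and lang not in languages  # unknown language skipped
    )
```

Passing a different mapping as `official` redefines the core languages, for example treating English as a core language for FI in addition to Finnish.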
Publication authority (of citation)
- This is the patent organization that published the cited document.
- It is normally a national patent
office, a regional office such as the EPO, or WIPO.
Processing Authority (of citation)
- Generally, the processing authority is taken from the publication
authority of the citation.
- For WO publications, the international search authority is chosen to
indicate which office processed the cited patent publication. This gives
an indication of the nature of the publication which may be more useful
for some purposes than simply the number of WO citations, which may be in
any of 10 languages.
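The assignment rule can be written as a short sketch. The argument names are hypothetical, and falling back to WO when no international search authority is known for the cited publication is an assumption of the example.

```python
def processing_authority(publication_authority, isa=None):
    """Return the processing authority of a cited patent document.

    `isa` is the international search authority of the cited WO
    publication, if known (hypothetical argument).
    """
    if publication_authority == "WO" and isa:
        return isa
    # Assumption: keep the publication authority in all other cases.
    return publication_authority
```

For example, a cited WO publication searched by JP is attributed to JP, while a cited US publication stays attributed to US.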