Sentence (string, lengths 30 to 302) | ID (int32, 0 to 49) | Label (class label, 2 classes)
No regional side effects were noted. | 0 | 2 (not ADE-related)
We describe the case of a 10-year-old girl with two epileptic seizures and subcontinuous spike-waves during sleep, who presented unusual side-effects related to clobazam (CLB) monotherapy. | 1 | 2 (not ADE-related)
The INR should be monitored more frequently when bosentan is initiated, adjusted, or discontinued in patients taking warfarin. | 2 | 2 (not ADE-related)
After the first oral dose of propranolol, syncope developed together with atrioventricular block. | 3 | 1 (ADE-related)
As termination was not an option for the family, the patient was extensively counseled and treated with oral ganciclovir. | 4 | 2 (not ADE-related)
Pulses have been given for periods up to three years without evident toxicity. | 5 | 2 (not ADE-related)
CONCLUSION: Pancreatic enzyme intolerance, although rare, would be a major problem in the management of patients with CF. | 6 | 2 (not ADE-related)
The treatment of Toxoplasma encephalitis in patients with acquired immunodeficiency syndrome. | 7 | 2 (not ADE-related)
A challenge with clozapine was feasible and showed no clinical symptoms of eosinophilia. | 8 | 2 (not ADE-related)
OBJECTIVE: To describe onset of syndrome of inappropriate antidiuretic hormone (SIADH) associated with vinorelbine therapy for advanced breast cancer. | 9 | 1 (ADE-related)
These results indicate that the hyponatremia in this case was due to SIADH and that SIADH was caused by an increased release of vasopressin probably because of the antiviral drug (acyclovir) or infection of varicella zoster virus (VZV) in a single dermatome. | 10 | 2 (not ADE-related)
Macular infarction after endophthalmitis treated with vitrectomy and intravitreal gentamicin. | 11 | 1 (ADE-related)
These cases were considered unusual in light of the short delay of their onset after initiation of immunosuppressive therapy and their fulminant course: 3 of these patients died of PCP occurring during the first month of treatment with prednisone. | 12 | 1 (ADE-related)
In 1991 the patient were found to be seropositive for HCV antibodies as detected by the ELISA method and confirmed by the RIBA method. | 13 | 2 (not ADE-related)
MRI has a high sensitivity and specificity in the diagnosis of osteonecrosis and should be used when this condition is suspected. | 14 | 2 (not ADE-related)
Treatment of silastic catheter-induced central vein septic thrombophlebitis. | 15 | 2 (not ADE-related)
These organisms have occasionally been reported as a cause of serious infections in man but have not been reported as a cause of shunt infection. | 16 | 2 (not ADE-related)
NEH must be considered in lupus patients receiving cytotoxic agents to avoid inappropriate use of corticosteroids or antibiotics in this self-limited condition. | 17 | 2 (not ADE-related)
The patient had no skin reactions for the next 12 mo, with the exception of injection-site papules. | 18 | 2 (not ADE-related)
Of the 16 patients, including the 1 reported here, only 3 displayed significant shortening of the agranulocytic period after treatment. | 19 | 2 (not ADE-related)
A closer look at septic shock. | 20 | 2 (not ADE-related)
A 24- to 48-h course of large-dose glucocorticoid therapy is often used in the acute management of spinal cord injury. | 21 | 2 (not ADE-related)
CT-scan disclosed right ethmoid sinusitis that spread to the orbit after surgery. | 22 | 2 (not ADE-related)
Sotalol-induced bradycardia reversed by glucagon. | 23 | 1 (ADE-related)
The cases are important in documenting that drug-induced dystonias do occur in patients with dementia, that risperidone appears to have contributed to dystonia among elderly patients, and that the categorization of dystonic reactions needs further clarification. | 24 | 1 (ADE-related)
No abnormalities were identified on review of collection and processing records. | 25 | 2 (not ADE-related)
A case study is presented of a licensed practical nurse who developed persistent contact dermatitis. | 26 | 2 (not ADE-related)
An encephalopathy and cardiomyopathy developed in a seventeen-year-old girl with chemotherapy-induced renal failure while receiving an intravesical aluminum infusion for hemorrhagic cystitis. | 27 | 1 (ADE-related)
The gold standard for diagnosis is renal biopsy, but it is only rarely performed during the acute phase of the reaction and is not without risk. | 28 | 2 (not ADE-related)
METHODS: We identified three patients who developed skin necrosis and determined any factors, which put them at an increased risk of doing so. | 29 | 2 (not ADE-related)
We describe a patient who developed HUS after treatment with mitomycin C (total dose 144 mg/m2) due to a carcinoma of the ascending colon. | 30 | 1 (ADE-related)
The authors caution that treatment with alprazolam may be complicated by the induction of mania. | 31 | 1 (ADE-related)
We report a case of long lasting respiratory depression after intravenous administration of morphine to a 7 year old girl with haemolytic uraemic syndrome. | 32 | 1 (ADE-related)
Best-corrected visual acuity measurements were performed at every visit. | 33 | 2 (not ADE-related)
Considerable improvement of myasthenic symptoms was seen in all patients within 3-6 months after the initiation of this therapy. | 34 | 2 (not ADE-related)
We present three patients with paradoxical seizures; their serum phenytoin levels were 43.5 mcg/mL, 46.5 mcg/mL and 38.3 mcg/mL. | 35 | 1 (ADE-related)
A patient with psoriasis is described who had an abnormal response to the glucose tolerance test without other evidence of diabetes and then developed postprandial hyperglycemia and glycosuria during a period of topical administration of a corticosteroid cream, halcinonide cream 0.1%, under occlusion. | 36 | 1 (ADE-related)
This report demonstrates the increased risk of complicated varicella associated with the use of corticosteroids, even for a short period of time. | 37 | 2 (not ADE-related)
This case report describes a 13-year-old male with diagnosis of autistic disorder and fetishistic behavior. | 38 | 2 (not ADE-related)
Several hypersensitivity reactions to cloxacillin have been reported, although IgE-mediated allergic reactions to the drug are rare and there is little information about possible tolerance to other semisynthetic penicillins or cephalosporins in patients with cloxacillin allergy. | 39 | 1 (ADE-related)
A 69-year-old male was diagnosed in February 2004 with stage IV extranodal marginal zone B cell lymphoma involving the mediastinal nodes, lung parenchyma and bone marrow with high LDH. | 40 | 2 (not ADE-related)
With serious cases, however, conventional treatment may not allow sufficient time at depth for the complete resolution of manifestations because of the need to avoid pulmonary oxygen toxicity which is associated with a prolonged period of breathing compressed air. | 41 | 2 (not ADE-related)
Thrombolytic treatment is advocated for critical patients unless emergency institution of cardio pulmonary bypass is required and/or indicated. | 42 | 2 (not ADE-related)
IMPLICATIONS: Dexmedetomidine, an alpha(2)-adrenoceptor agonist, is indicated for sedating patients on mechanical ventilation. | 43 | 2 (not ADE-related)
Remarkable findings on initial examination were facial grimacing, flexure posturing of both upper extremities, and 7-mm, reactive pupils. | 44 | 2 (not ADE-related)
Acute promyelocytic leukemia after living donor partial orthotopic liver transplantation in two Japanese girls. | 45 | 2 (not ADE-related)
The mechanism by which sunitinib induces gynaecomastia is thought to be associated with an unknown direct action on breast hormonal receptors. | 46 | 1 (ADE-related)
Early detection of these cases has practical importance since the identification and elimination of the causative drug is essential for therapy success. | 47 | 2 (not ADE-related)
CONCLUSIONS: These results suggest that clozapine may cause TD; however, the prevalence is low and the severity is relatively mild, with no or mild self-reported discomfort. | 48 | 1 (ADE-related)
METHODS: This study is a case report description. | 49 | 2 (not ADE-related)

Dataset Card for RAFT

Dataset Summary

The Real-world Annotated Few-shot Tasks (RAFT) dataset is an aggregation of English-language datasets found in the real world. Associated with each dataset is a binary or multiclass classification task, intended to improve our understanding of how language models perform on tasks that have concrete, real-world value. Only 50 labeled examples are provided in each dataset.

Supported Tasks and Leaderboards

  • text-classification: Each subtask in RAFT is a text classification task, and the provided train and test sets can be used to submit to the RAFT Leaderboard. To prevent overfitting and tuning on the held-out test set, the leaderboard is evaluated only once per week. A macro-F1 score is calculated for each task, and those scores are then averaged to produce the overall leaderboard score.
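The scoring rule described above (per-task macro-F1, then a plain average across tasks) can be sketched in a few lines of Python. This is an illustration only, not the leaderboard's actual scoring code: the toy task names and labels are made up, and the choice to average over every class that appears in either the true or the predicted labels is an assumption.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over classes seen in y_true or y_pred (assumption)."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def overall_score(per_task):
    """Average the per-task macro-F1 scores into one number."""
    scores = [macro_f1(y_true, y_pred) for y_true, y_pred in per_task.values()]
    return sum(scores) / len(scores)


# Hypothetical toy data: task name -> (true labels, predicted labels).
toy = {
    "task_a": ([1, 1, 2, 2], [1, 2, 2, 2]),
    "task_b": ([1, 2, 3], [1, 2, 3]),
}
print(overall_score(toy))
```

Because each class and each task carries equal weight, a rare class or a small task affects the final score as much as a common class or a large task.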

Languages

RAFT is entirely in American English (en-US).

Dataset Structure

Data Instances

Dataset First Example
Ade Corpus V2
Sentence: No regional side effects were noted.
ID: 0
Label: 2
Banking 77
Query: Is it possible for me to change my PIN number?
ID: 0
Label: 23
NeurIPS Impact Statement Risks
Paper title: Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation...
Paper link: https://proceedings.neurips.cc/paper/2020/file/ec1f764517b7ffb52057af6df18142b7-Paper.pdf...
Impact statement: This work makes the first attempt to search for all key components of panoptic pipeline and manages to accomplish this via the p...
ID: 0
Label: 1
One Stop English
Article: For 85 years, it was just a grey blob on classroom maps of the solar system. But, on 15 July, Pluto was seen in high resolution ...
ID: 0
Label: 3
Overruling
Sentence: in light of both our holding today and previous rulings in johnson, dueser, and gronroos, we now explicitly overrule dupree....
ID: 0
Label: 2
Semiconductor Org Types
Paper title: 3Gb/s AC-coupled chip-to-chip communication using a low-swing pulse receiver...
Organization name: North Carolina State Univ.,Raleigh,NC,USA
ID: 0
Label: 3
Systematic Review Inclusion
Title: Prototyping and transforming facial textures for perception research...
Abstract: Wavelet based methods for prototyping facial textures for artificially transforming the age of facial images were described. Pro...
Authors: Tiddeman, B.; Burt, M.; Perrett, D.
Journal: IEEE Comput Graphics Appl
ID: 0
Label: 2
TAI Safety Research
Title: Malign generalization without internal search
Abstract Note: In my last post, I challenged the idea that inner alignment failures should be explained by appealing to agents which perform ex...
Url: https://www.alignmentforum.org/posts/ynt9TD6PrYw6iT49m/malign-generalization-without-internal-search...
Publication Year: 2020
Item Type: blogPost
Author: Barnett, Matthew
Publication Title: AI Alignment Forum
ID: 0
Label: 1
Terms Of Service
Sentence: Crowdtangle may change these terms of service, as described above, notwithstanding any provision to the contrary in any agreemen...
ID: 0
Label: 2
Tweet Eval Hate
Tweet: New to Twitter-- any men on here know what the process is to get #verified?...
ID: 0
Label: 2
Twitter Complaints
Tweet text: @HMRCcustomers No this is my first job
ID: 0
Label: 2

Data Fields

The ID field is used for indexing data points. It will be used to match your submissions with the true test labels, so you must include it in your submission. All other columns contain textual data. Some contain links and URLs to websites on the internet.
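As a rough sketch of why the ID matters, the snippet below writes predictions together with their IDs so that each row can later be matched to its true label. The two-column CSV layout, the example IDs, and the use of textual label names are assumptions made for illustration; this card does not specify the submission format.

```python
import csv
import io


def write_predictions(rows, handle):
    """Write (ID, Label) pairs as CSV so each prediction keeps its ID."""
    writer = csv.writer(handle)
    writer.writerow(["ID", "Label"])  # assumed column layout
    for example_id, label in rows:
        writer.writerow([example_id, label])


# Hypothetical predictions for two test examples.
predictions = [(50, "not ADE-related"), (51, "ADE-related")]
buffer = io.StringIO()
write_predictions(predictions, buffer)
print(buffer.getvalue())
```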

All output fields are designated with the "Label" column header. A value of 0 in this column indicates that the entry is unlabeled; it should appear only in the unlabeled test set. All other values are dataset-specific class labels. To get their textual value for a given dataset:

import datasets

# Load the dataset (without a split argument, this returns one entry per split).
dataset = datasets.load_dataset("ought/raft", "ade_corpus_v2")
# Get the object that holds information about the "Label" feature, taken from the train split.
label_info = dataset["train"].features["Label"]
# Use the int2str method to access the textual labels.
print([label_info.int2str(i) for i in (0, 1, 2)])
# ['Unlabeled', 'ADE-related', 'not ADE-related']

Data Splits

There are two splits provided: train data and unlabeled test data.

The training examples were chosen at random. No attempt was made to ensure that classes were balanced or proportional in the training data. Indeed, the Banking 77 task has 77 classes, so its 50 training examples cannot include every class.

Dataset | Train Size | Test Size
Ade Corpus V2 | 50 | 5000
Banking 77 | 50 | 5000
NeurIPS Impact Statement Risks | 50 | 150
One Stop English | 50 | 516
Overruling | 50 | 2350
Semiconductor Org Types | 50 | 449
Systematic Review Inclusion | 50 | 2243
TAI Safety Research | 50 | 1639
Terms Of Service | 50 | 5000
Tweet Eval Hate | 50 | 2966
Twitter Complaints | 50 | 3399
Total | 550 | 28712
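The totals in the table can be checked with a short calculation. The sketch below copies the test sizes from the table and also works out how many Banking 77 classes must be missing from its training split; the variable names are illustrative.

```python
# Test-set sizes copied from the table above.
test_sizes = {
    "Ade Corpus V2": 5000,
    "Banking 77": 5000,
    "NeurIPS Impact Statement Risks": 150,
    "One Stop English": 516,
    "Overruling": 2350,
    "Semiconductor Org Types": 449,
    "Systematic Review Inclusion": 2243,
    "TAI Safety Research": 1639,
    "Terms Of Service": 5000,
    "Tweet Eval Hate": 2966,
    "Twitter Complaints": 3399,
}
TRAIN_PER_TASK = 50  # every task ships exactly 50 labeled examples

total_train = TRAIN_PER_TASK * len(test_sizes)
total_test = sum(test_sizes.values())
print(total_train, total_test)

# Each training example carries one label, so 50 examples cover at most 50 classes.
banking_classes = 77
min_missing_classes = banking_classes - min(TRAIN_PER_TASK, banking_classes)
print(min_missing_classes)
```

At least 27 of the 77 Banking 77 classes therefore have no training example at all, whichever examples were sampled.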

Dataset Creation

Curation Rationale

Generally speaking, the rationale behind RAFT was to create a benchmark for evaluating NLP models that does not rely on contrived or artificial data sources, using tasks that were not originally assembled for the purpose of testing NLP models. However, each individual dataset in RAFT was collected independently. For the majority of datasets, we only collected them second-hand from existing curated sources. The datasets that we curated are:

  • NeurIPS impact statement risks
  • Semiconductor org types
  • TAI Safety Research

Each of these three datasets was sourced from our existing collaborators at Ought. They had used our service, Elicit, to analyze their datasets in the past, and we contacted them to include their datasets and the associated classification tasks in the benchmark. For all datasets, more information is provided in our paper. For the ones which we did not curate, we provide a link to the dataset. For the ones which we did, we provide a datasheet that elaborates on many of the topics here in greater detail.

For the three datasets that we introduced:

  • NeurIPS impact statement risks: The dataset was created to evaluate the then-new requirement for authors to include an "impact statement" in their 2020 NeurIPS papers. Had it been successful? What kind of things did authors mention the most? How long were impact statements on average? Etc.
  • Semiconductor org types: The dataset was originally created to better understand which countries’ organisations have contributed most to semiconductor R&D over the past 25 years, using three main conferences. Moreover, to estimate the share of academic and private sector contributions, the organisations were classified as “university”, “research institute” or “company”.
  • TAI Safety Research: The primary motivations for assembling this database were to: (1) Aid potential donors in assessing organizations focusing on TAI safety by collecting and analyzing their research output. (2) Assemble a comprehensive bibliographic database that can be used as a base for future projects, such as a living review of the field.

In the following sections, we describe only the datasets we introduced. All other dataset details, and more details on the ones described here, can be found in our paper.

Source Data

Initial Data Collection and Normalization

  • NeurIPS impact statement risks: The data was directly observable (raw text scraped) for the most part, although some data was taken from previous datasets (which themselves had taken it from raw text). The data was validated, but only in part, by human reviewers. Cf. this link for full details:
  • Semiconductor org types: We used the IEEE API to obtain institutions that contributed papers to semiconductor conferences in the last 25 years. This is a random sample of 500 of them with a corresponding conference paper title. The three conferences were the International Solid-State Circuits Conference (ISSCC), the Symposia on VLSI Technology and Circuits (VLSI) and the International Electron Devices Meeting (IEDM).
  • TAI Safety Research: We asked TAI safety organizations what their employees had written, emailed some individual authors, and searched Google Scholar. See the LessWrong post for more details: https://www.lesswrong.com/posts/4DegbDJJiMX2b3EKm/tai-safety-bibliographic-database

Who are the source language producers?

  • NeurIPS impact statement risks: Language generated by NeurIPS 2020 impact statement authors, generally the authors of the submitted papers.
  • Semiconductor org types: Language obtained from the IEEE API, generally machine-formatted names and titles of academic papers.
  • TAI Safety Research: Language generated by authors of TAI safety research publications.

Annotations

Annotation process

  • NeurIPS impact statement risks: Annotations were entered directly into a Google Spreadsheet with instructions, labeled training examples, and unlabeled testing examples.
  • Semiconductor org types: Annotations were entered directly into a Google Spreadsheet with instructions, labeled training examples, and unlabeled testing examples.
  • TAI Safety Research: N/A

Who are the annotators?

  • NeurIPS impact statement risks: Contractors paid by Ought performed the labeling of whether impact statements mention harmful applications. A majority vote was taken from 3 annotators.
  • Semiconductor org types: Contractors paid by Ought performed the labeling of organization types. A majority vote was taken from 3 annotators.
  • TAI Safety Research: The dataset curators annotated the dataset by hand.
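The majority vote over 3 annotators mentioned above can be illustrated with a small helper. This is a sketch, not the curators' actual procedure: the function name is hypothetical, and returning None when no label has a strict majority is an assumption, since the card does not say how three-way disagreements were resolved.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by more than half of the annotators, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


# Two of three annotators agree, so their label wins.
print(majority_vote(["university", "company", "university"]))
# All three disagree, so no label has a strict majority (assumed handling).
print(majority_vote(["university", "company", "research institute"]))
```

With a binary label and 3 annotators, a strict majority always exists; a tie can only occur in tasks with three or more classes, such as the organization types.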

Personal and Sensitive Information

It is worth mentioning that the Tweet Eval Hate dataset, by necessity, contains highly offensive content.

  • NeurIPS impact statement risks: The dataset contains authors' names. These were scraped from publicly available scientific papers submitted to NeurIPS 2020.
  • Semiconductor org types: N/A
  • TAI Safety Research: N/A

Considerations for Using the Data

Social Impact of Dataset

  • NeurIPS impact statement risks: N/A
  • Semiconductor org types: N/A
  • TAI Safety Research: N/A

Discussion of Biases

  • NeurIPS impact statement risks: N/A
  • Semiconductor org types: N/A
  • TAI Safety Research: N/A

Other Known Limitations

  • NeurIPS impact statement risks: This dataset has limitations that should be taken into consideration when using it. In particular, the method used to collect broader impact statements involved automated downloads, conversions and scraping and was not error-proof. Although care has been taken to identify and correct as many errors as possible, not all texts have been reviewed by a human. This means it is possible some of the broader impact statements contained in the dataset are truncated or otherwise incorrectly extracted from their original article.
  • Semiconductor org types: N/A
  • TAI Safety Research: Don't use it to create a dangerous AI that could bring the end of days.

Additional Information

Dataset Curators

The overall RAFT curators are Neel Alex, Eli Lifland, and Andreas Stuhlmüller.

  • NeurIPS impact statement risks: Volunteers working with researchers affiliated with Oxford's Future of Humanity Institute (Carolyn Ashurst, now at The Alan Turing Institute) created the impact statements dataset.
  • Semiconductor org types: The data science unit of Stiftung Neue Verantwortung (Berlin).
  • TAI Safety Research: Angelica Deibel and Jess Riedel. We did not do it on behalf of any entity.

Licensing Information

RAFT aggregates many other datasets, each of which is provided under its own license. Generally, those licenses permit research and commercial use.

Dataset | License
Ade Corpus V2 | Unlicensed
Banking 77 | CC BY 4.0
NeurIPS Impact Statement Risks | MIT License/CC BY 4.0
One Stop English | CC BY-SA 4.0
Overruling | Unlicensed
Semiconductor Org Types | CC BY-NC 4.0
Systematic Review Inclusion | CC BY 4.0
TAI Safety Research | CC BY-SA 4.0
Terms Of Service | Unlicensed
Tweet Eval Hate | Unlicensed
Twitter Complaints | Unlicensed

Citation Information

[More Information Needed]

Contributions

Thanks to @neel-alex, @uvafan, and @lewtun for adding this dataset.
