NIH Internal Consultation Meeting, July 31, 2007

Meeting Summary

Opening Remarks

Dr. Larry Tabak and Dr. Jeremy Berg, Co-Chairs of the NIH Steering Committee Ad Hoc Working Group on Peer Review

The increasing breadth, complexity, and interdisciplinary nature of modern science have created challenges for peer review, a key component of a larger system: the biomedical and behavioral research enterprise. To adapt to the evolving scientific landscape, it is important for the NIH to ensure that the processes used to support science are as efficient and effective for applicants, reviewers, and NIH staff as possible with available resources. It is critical that the NIH continue to draw the most talented reviewers.

In September 2006, NIH leadership resolved that enhancing the NIH peer review system would be a top agency priority. Subsequent NIH staff brainstorming sessions laid the foundation for the current examination process, which is a partnership between the NIH and the scientific community. This involves both an external working group (the Advisory Committee to the Director Working Group on Peer Review, co-chaired by Dr. Keith Yamamoto and Dr. Larry Tabak, which will also select a series of science liaisons for further outreach) and an internal working group (the Steering Committee Working Group on Peer Review, co-chaired by Dr. Larry Tabak and Dr. Jeremy Berg). In parallel, the Center for Scientific Review (CSR) has launched several peer review pilots and initiatives; these efforts will continue along with the analyses of prior NIH “experiments” with peer review practices. The core questions to be analyzed during the NIH peer review self-study process include:

  • Is the system currently used by NIH to support biomedical and behavioral research optimal?
  • Do the best scientists/scientific ideas score highest in review?
  • Are we engaging the best reviewers?
  • Should we increase program flexibility to enhance peer review? If so, how?
  • Should we increase review flexibility to enhance peer review? If so, how?

Other steps in the peer review self-study process include:

  • Posting of a Request for Information (RFI) and an interactive Web site for soliciting opinions (July to September 2007)
  • Advisory Committee to the Director Working Group holds a series of 5 regional town meetings (July to October 2007)
  • Steering Committee Working Group holds consultative meetings within the NIH and creates a Web-based survey for soliciting NIH staff opinions (July to October 2007)
  • Dr. Tabak and Dr. Berg provide updates to various Institute/Centers (IC) Advisory Councils (Fall 2007)
  • NIH leadership considers input from the RFI and both Working Groups and determines next steps, including pilots (February 2008)
  • NIH staff design and initiate pilots and associated evaluations (March 2008)

Once plans are in place, the NIH will hold briefings for NIH staff, scientific societies, trade press, advocacy organizations, and Congress. Successful pilots will be expanded, commencing development and implementation of the new NIH Peer Review Policy in 2008.

Dr. Tabak and Dr. Berg provided key points that have emerged from other peer review self-study meetings to date:

Key messages from NIH Consultation Meeting I (July 18, 2007, NIH campus):

  • Peer review changes cannot address the fundamental imbalance between the number of applications received and the number of awards possible.
  • The first stage of peer review is only one component of a larger system that includes Advisory Council Review and IC funding decisions.
  • Peer review has become increasingly complex, and pilot studies need to address impact on applicants, reviewers, and NIH staff.
  • Recruiting and training optimal reviewers is key.
  • The structure of applications and review criteria should be better aligned.
  • Single scores do not reflect the totality of the quality of an application.
  • Applicant populations at distinct career stages may benefit from tailored review panels and distinct funding mechanisms.

Key messages from the External Working Group Meeting I (July 30, 2007, Washington DC):

  • Despite understanding the stress of a limited funding climate, the scientific community has considerable anxiety about recent trends in the re-application process (many see a bias against original (unamended) applications).
  • There is concern about the training of reviewers and study section chairs, and about possible reluctance to address poor performance.
  • Some in the community feel that new science may be “crowded out” due to perceived conservatism in a system in which peer reviewers are also competitors for research funding.
  • The community voices a plea to consider novel review criteria and processes for interdisciplinary research.
  • There is a need for the NIH to perform “science on science,” and psychometrics and “decision science” should inform this process.
  • The current multitude of NIH grant mechanisms—some of which are used differently even within an individual NIH IC—confuses NIH grant applicants.
  • Some in the community have suggested that the NIH re-visit policies and practices about the percent support of an investigator’s salary paid by an NIH grant.
  • Some in the community have proposed relaxing indirect cost rate control to redistribute costs.

Open Discussion

Challenges and Solutions for the NIH System of Research Support

Relationship of the NIH with Institutions

There is debate about how the NIH should most responsibly spend the public research investment. While NIH grants are considered funding-in-aid, many investigators (and their institutions) rely on NIH funds for nearly full salary support. Some question the fact that NIH support goes to institutional expenses (salary, start-up costs) and not to the conduct of research: Should the NIH negotiate the terms and conditions of awards? Proposals to administer block grants to institutions (either untargeted or for subpopulations, such as unsuccessful applicants or new investigators) have been suggested but not well received, mainly because fairness in the distribution of such funds could be especially difficult to evaluate and enforce.

Grant Characteristics

In the 1960s and 1970s, most scientists with NIH funding had a single grant, which was sufficient to drive and sustain their research livelihood. Today, the environment has changed, with multiple, staggered sources of funding often being necessary. Altering grant duration—longer for experienced investigators—may be one way to reduce reviewer load while providing researcher security. Potential downsides of this approach include reduced success after time out from the application process and salary effects related to grants. Since the NIH has been funding MERIT and Javits awards for many years, analysis of these data may inform the issue. Some think that the NIH should exert more authority in curtailing awarded grants—for poor performance or for straying too far from research aims.

Review Practices

Review criteria that retrospectively evaluate a person (or a laboratory, as is done in the NIH intramural program), instead of a project, may increase flexibility in review and enable researchers to adapt more readily and quickly to evolving science. This approach has appeared to be quite successful thus far in the reviews of the NIH Director’s Pioneer and New Innovator Awards, but it is difficult to imagine implementing the strategy on a larger scale due to logistical constraints.

Tailoring grants to certain applicant populations (such as new investigators) may address the potentially negative funding impacts common to career transition points, although some feel that the NIH may be going too far to protect new investigators at the expense of mid-career scientists. More data are needed to evaluate the effects of the current funding climate on these researchers.

Joint review panels may provide the means to distribute expertise for the review of interdisciplinary projects, akin to how ICs receive secondary assignments on grants. Interdisciplinary grant reviews face an additional complexity, in that any funded project must have an NIH “home,” but many NIH ICs are organized by discipline.

Peer Review Criteria and Scoring

Application and Scoring Issues

Many in the applicant, review, and NIH staff communities agree that a single score doesn’t always effectively communicate the strengths and weaknesses of a proposal. Yet, since one decision must ultimately be made, it is important to find the critical pieces of information that can best reflect the research quality and feasibility of an application.

In the recently introduced NIGMS EUREKA award program, review criteria will prioritize innovation and potential impact above all else. Evaluation of the outcomes of this program will be useful for the NIH peer review self-study exercise. Related proposals suggest multi-step or multi-component reviews, in which innovation/impact and technical merit are reviewed in distinct phases. Another possibility is contract-style review, in which distinct criteria are separately weighted, assessed, and reported to applicants, reviewers, and program staff.

Some have suggested fairly radical moves such as deleting methodological information entirely from application content. Many believe this strategy could have unintended consequences such as favoring senior investigators and/or making it difficult to distinguish what is doable from what is not. Moreover, in many areas of science choosing appropriate methodology is a critical part of study design requiring evaluation.

Enhancing Integration and Communication

Better alignment of application structure and review criteria would benefit all participants in the peer review process. In general, many feel that improved communication to and from applicants and reviewers would facilitate the review process and potentially improve the accuracy and fairness of reviews. SRAs should encourage reviewers to be clear, explicit, and complete in their comments to applicants. Enhanced reviewer training may also clarify roles and processes within a study section, and providing constructive feedback to reviewers would likely be useful.

All facets of the review process would benefit from the calibration of language, perhaps via numerical sub-scores for different criteria or elements. While many promote the idea of sharing more information about an individual review with both the applicant and IC program staff, others counter that this will jeopardize the validity and unbiased nature of the review. Others noted that providing information to applicants in its rawest form may be the best way to avoid misunderstandings. Additional suggestions for helping applicants improve grant-writing skills include providing adjectival scores or one-sentence descriptions for unsuccessful applications, and supplying score variability and range data.

There was little support for a proposed idea to include applicants more directly in their own review (e.g., via the Internet or telephone). Most thought this would add undue confusion and subjectivity to a review, without a sufficiently positive impact on the applicant.

Core Values of the NIH Peer Review Process

Roles of Program and Review Staff

There is a substantial lack of awareness within the scientific community that both levels of peer review are considered advisory, and that funding decisions ultimately lie with an IC. While peer review is only one component of the biomedical and behavioral research system, it cannot be completely severed from funding. Simply by unscoring an application, for example, reviewers have essentially made a funding decision. Similarly, assigning a percentile score to an application often has direct funding implications. The balance of influence between reviewers, program staff, and advisory councils varies across the NIH, and ICs use a variety of reasons to “skip” or “fund out of order.” Some of these include IC or NIH priorities, public health or societal need, portfolio balance and workforce-related concerns.

Removing scores and binning grants, as is done at the National Science Foundation, may be able to reduce complexity and diminish ambiguity of the current 41-point scoring system, but many view this process as shifting decisional power from review to program staff. Over the years, various binning scenarios have been tried by NIH ICs, with mixed results. The NIH should study these cases to help design pilots for evaluating the pros and cons of binning, quartile-ranking, and similar approaches.

Reviewer Qualifications

Qualified and enthusiastic reviewers are the linchpin of the peer review system. Various suggestions have surfaced on how to recruit the best reviewers, including making study section service mandatory or more flexible (e.g., rotating among sections), recruiting outside the NIH, including the public under certain circumstances, and providing time-based or financial incentives. The 10-fold growth in reviewers used over the past 20 years is illustrative of how workload volume has diluted the experience of the reviewer pool. There simply are not 10 times as many senior researchers as there were 20 years ago. Many feel that cultural change is needed to transform study section service from a chore to a desirable and intellectually stimulating activity.

There is considerable debate about the specific characteristics of the “best” reviewers. For example, while most scientists agree that content experts—peers—are the best equipped to review grants in their area of expertise, it is nonetheless the case that the people who establish paradigms (often the leaders in a given field) have the most to lose from paradigm shifts. Thus, in this manner of thinking, does peer review consisting solely of narrow subject matter experts inadvertently discourage new ideas?

Broad-thinking scientists may be especially useful in identifying innovative and high-impact science in fields outside their own, but there are a number of pressures that force the use of subject-matter experts. Aside from having sufficient knowledge about scientific content, most agree that any good reviewer should be communicative, timely, willing to speak out with conviction, and open to changing a score based on a persuasive argument from another reviewer.

There is general consensus that a mix of ages and experience is best on a review panel, although this assumption has not been rigorously tested. In addition, requirements may vary based upon the nature of the science under review. The NIH could amass data from existing reviews to begin to address this issue and/or design pilots. The NIH should use caution in drafting large-scale policy pilots and decisions based on a very small number of “extreme” examples (e.g., 75-member review panels). Consulting the science of group dynamics would also inform these efforts.

Key Messages from the July 31, 2007 NIH Consultation II Meeting:

  • Recruitment and education of qualified reviewers is critical to the health of the research enterprise.
  • Optimal reviewers should be communicative, open-minded, and willing to speak out with conviction.
  • Interdisciplinary research projects have unique review needs.
  • Separating review criteria may help reviewers identify innovation and potential impact.
  • Selected peer review “experiments” to identify best practices can be done with existing NIH data.
  • Evaluation of existing NIH peer review practices requires defining and articulating optimal review outcomes aside from priority scores.
  • Improved communication between applicants, reviewers, and NIH staff would enhance review.
  • Reviewing a person vs. a project has many advantages but may be impractical on a large scale.
  • Extending grant funding for senior investigators may reduce review burden but may also have untoward consequences.
  • Practice and opinion vary across NIH regarding the respective roles of review and program staff in the overall peer review process.

This page was last reviewed on August 17, 2007.