
update
FYYFU committed Dec 21, 2023
1 parent 06119fe commit 9e57b30
Showing 1 changed file with 6 additions and 5 deletions.
index.html (6 additions, 5 deletions)

@@ -126,12 +126,12 @@ <h1 class="title is-1 publication-title">Safety Alignment in NLP Tasks: <br>Weak
 
 
 <!-- Teaser video-->
-<!--
-<section class="hero teaser">
+<!--<section class="hero teaser">
   <div class="container is-max-desktop">
     <div class="hero-body">
-      <video poster="" id="tree" autoplay controls muted loop height="100%">
+      <video poster="" id="tree" autoplay controls muted loop height="100%"> -->
       <!-- Your video here -->
+      <!--
       <source src="static/videos/banner_video.mp4"
         type="video/mp4">
       </video>
@@ -140,7 +140,8 @@ <h2 class="subtitle has-text-centered">
       </h2>
     </div>
   </div>
-  </section> -->
+  </section> -->
+
 <!-- End teaser video -->
 
 <!-- Paper abstract -->
@@ -151,7 +152,7 @@ <h2 class="subtitle has-text-centered">
         <h2 class="title is-3">Abstract</h2>
         <div class="content has-text-justified">
           <p>
-            Lorem ipsum dolor sit amet, consectetur adipiscing elit. Proin ullamcorper tellus sed ante aliquam tempus. Etiam porttitor urna feugiat nibh elementum, et tempor dolor mattis. Donec accumsan enim augue, a vulputate nisi sodales sit amet. Proin bibendum ex eget mauris cursus euismod nec et nibh. Maecenas ac gravida ante, nec cursus dui. Vivamus purus nibh, placerat ac purus eget, sagittis vestibulum metus. Sed vestibulum bibendum lectus gravida commodo. Pellentesque auctor leo vitae sagittis suscipit.
+            Recent developments in balancing the usefulness and safety of Large Language Models (LLMs) have raised a critical question: Are mainstream NLP tasks adequately aligned with safety considerations? Our study, focusing on safety-sensitive documents obtained through adversarial attacks, reveals significant disparities in the safety alignment of various NLP tasks. For instance, LLMs can effectively summarize malicious long documents but often refuse to translate them. This discrepancy highlights a previously unidentified vulnerability: attacks exploiting tasks with weaker safety alignment, like summarization, can potentially compromise the integrity of tasks traditionally deemed more robust, such as translation and question-answering (QA). Moreover, the concurrent use of multiple NLP tasks with lesser safety alignment increases the risk of LLMs inadvertently processing harmful content. We demonstrate these vulnerabilities in various safety-aligned LLMs, particularly Llama2 models and GPT-4, indicating an urgent need to strengthen safety alignment across a broad spectrum of NLP tasks.
           </p>
         </div>
       </div>
