csv.Sniffer regular expression has significant backtracking

# Bug report

### Bug description:

Certain strings passed to `csv.Sniffer` trigger heavy regular-expression backtracking and very long processing times. For example:

```python
import csv
import time

NUM_ITERATIONS = 200
# With NUM_ITERATIONS = 2 the test string is: "","",""0""0
test_str = '"",' * NUM_ITERATIONS + '"' * NUM_ITERATIONS + '0' + '"' * NUM_ITERATIONS + '0'
print(test_str)

# perf_counter() is a monotonic clock intended for measuring elapsed time
t0 = time.perf_counter()

dialect = csv.Sniffer().sniff(test_str)

t1 = time.perf_counter()
print(f"{t1 - t0}")
```

Some example runs:

```
NUM_ITERATIONS Running Time (seconds)
1              0.0008
10             0.0030
100            254.32
```
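To reproduce the growth shown above without waiting minutes, a sweep over small sizes can be sketched as follows. The helper names `build_input` and `time_sniff` are hypothetical (they are not part of the `csv` module), and the sizes are deliberately small because the running time rises steeply on affected versions.

```python
import csv
import time


def build_input(n):
    """Build the test string from this report for size n (hypothetical helper)."""
    return '"",' * n + '"' * n + '0' + '"' * n + '0'


def time_sniff(n):
    """Return the seconds csv.Sniffer().sniff() spends on build_input(n)."""
    data = build_input(n)
    start = time.perf_counter()
    try:
        csv.Sniffer().sniff(data)
    except csv.Error:
        # Only the elapsed time matters here, not whether a dialect was found.
        pass
    return time.perf_counter() - start


if __name__ == "__main__":
    # Keep the sizes small: this report measured minutes at n = 100.
    for n in (1, 5, 10):
        print(f"{n:>4}  {time_sniff(n):.4f}")
```

Timings will differ by machine and Python version, so the output is best read as a trend rather than compared to the exact figures in the table.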

I've checked several Python versions, and they all show similar running times.

```
For input NUM_ITERATIONS 100 above
Version       : Running Time (Seconds)
Python 3.8.16 : 254
Python 3.9.16 : 250
Python 3.10.9 : 319
Python 3.11.1 : 236
```

The issue lies in the regex used to detect the double-quoted format: https://github.com/python/cpython/blob/b303d3ad3e80e1d9b3befe6650f61f38b72179a4/Lib/csv.py#L274

In my testing, using a zero-length lookahead assertion (or an atomic group) gives a significant performance improvement:

``` python
r"((%(delim)s)|^)\W*%(quote)s(?=(?P<zero>[^%(delim)s%(quote)s\n]*))(?P=zero)%(quote)s[^%(delim)s\n]*%(quote)s\W*((%(delim)s)|$)"
```
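For readers unfamiliar with the trick, the snippet below is a minimal sketch of how a lookahead plus a backreference behaves like an atomic group: the lookahead captures the longest run once, and the backreference then consumes exactly that text, so the engine cannot retry shorter runs. The toy pattern on `a*` is my own illustration, not taken from `csv.py`. The substitution values (`re.escape(",")`, `'"'`) and the `re.MULTILINE` flag are assumptions about how the pattern would be compiled. Python 3.11 and later also accept native atomic groups written as `(?>...)`.

```python
import re

# Plain greedy quantifier: the engine may give characters back, so this matches.
plain = re.compile(r"a*ab")

# Atomic-like version: the lookahead captures the longest run of "a" and the
# backreference consumes exactly that run, with no shorter alternatives to retry.
atomic_like = re.compile(r"(?=(?P<run>a*))(?P=run)ab")

print(plain.match("aaab") is not None)        # True
print(atomic_like.match("aaab") is not None)  # False: the run already took every "a"

# The pattern proposed in this report, with assumed values for delim and quote.
PROPOSED = (
    r"((%(delim)s)|^)\W*%(quote)s(?=(?P<zero>[^%(delim)s%(quote)s\n]*))(?P=zero)"
    r"%(quote)s[^%(delim)s\n]*%(quote)s\W*((%(delim)s)|$)"
)
proposed = re.compile(PROPOSED % {"delim": re.escape(","), "quote": '"'}, re.MULTILINE)

# A field containing a doubled quote is still detected.
print(proposed.search('"a""b",c') is not None)  # True
```

The toy case shows the trade-off: an atomic-like group can reject input that the backtracking version accepts. That is why the proposed pattern also excludes the quote character from the captured class.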

![image](https://github.com/python/cpython/assets/5122866/936d9f53-f079-496f-b4dc-7943d18ad42c)




### CPython versions tested on:

3.8, 3.9, 3.10, 3.11

### Operating systems tested on:

Linux


### Linked PRs
* gh-109639
