Improvement: new TNM regex #366

Draft
LucasDedieu wants to merge 30 commits into master from tnm_new_regex

LucasDedieu (Collaborator) commented Jan 24, 2025

Description

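Add a new TNM regex that outperforms the old one, for instance by matching mentions that the old pattern misses (see the code example). By default, eds.tnm will use the new regex pattern; the old one remains accessible by passing it explicitly through the pattern argument.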

Installation:

 pip install git+https://github.com/aphp/edsnlp.git@tnm_new_regex

Code example:

import edsnlp, edsnlp.pipes as eds
from edsnlp.pipes.ner.tnm.patterns_new import tnm_pattern_new
from edsnlp.pipes.ner.tnm.patterns import tnm_pattern

text = "Mise à jour de la classification : T3 N1b M0."

# Old pattern: the mention is not detected
nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.tnm(pattern=tnm_pattern))
print(nlp(text).ents)
# Out: ()

# New pattern (the new default): the mention is detected
nlp_new = edsnlp.blank("eds")
nlp_new.add_pipe(eds.tnm(pattern=tnm_pattern_new))
print(nlp_new(text).ents)
# Out: (T3 N1b M0)
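
To make the difference concrete without installing the branch, here is a minimal sketch using only the standard `re` module. Both patterns are hypothetical toy regexes written for this illustration; they are not the contents of patterns.py or patterns_new.py, and the suffix handling shown is only one plausible reason a stricter pattern could miss the mention above.

```python
import re

text = "Mise à jour de la classification : T3 N1b M0."

# Hypothetical toy patterns, NOT the edsnlp ones.
# Strict: each component is a letter followed by a single digit.
strict = re.compile(r"\bT[0-4]\s*N[0-3]\s*M[01]\b")
# Relaxed: each component may also carry a lowercase suffix such as "b".
relaxed = re.compile(r"\bT[0-4][a-c]?\s*N[0-3][a-c]?\s*M[01][a-c]?\b")

strict_hits = [m.group(0) for m in strict.finditer(text)]
relaxed_hits = [m.group(0) for m in relaxed.finditer(text)]

print(strict_hits)   # []
print(relaxed_hits)  # ['T3 N1b M0']
```

The strict toy pattern fails on the "b" after "N1", while the relaxed one accepts it, which mirrors the old/new behaviour reported in the example above.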

Changes

  • patterns_new.py: New file containing the new TNM regex. Compared to the old one, it adds many new sections.
  • patterns.py: Old regex file. Some sections were renamed to match the new section names used in model.py.
  • tnm.py: The default pattern is now the new pattern.
  • test_tnm.py: The tnm pipe definition was changed so that the tests still use the old regex.
  • model.py: Part of the pydantic typing validation was removed so that the model works with both the old and the new patterns.
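
As a rough illustration of how named regex sections can feed a model that tolerates differing group sets, here is a small sketch. It deliberately uses a standard-library dataclass instead of the pydantic model in model.py. The field names `node_prefix`, `metastasis_prefix` and `pleura` are taken from the coverage report in this conversation; the `tumour`, `node` and `metastasis` fields, the `ToyTNM` class, the `from_match` helper and the toy regex are assumptions made for this example only.

```python
import re
from dataclasses import dataclass, fields
from typing import Optional

# Hypothetical toy regex with named groups; the real patterns are much richer.
TOY_TNM = re.compile(
    r"T(?P<tumour>[0-4])\s*"
    r"(?P<node_prefix>[cpy])?N(?P<node>[0-3])\s*"
    r"(?P<metastasis_prefix>[cpy])?M(?P<metastasis>[01])"
    r"(?:\s*PL(?P<pleura>[0-3]))?"
)


@dataclass
class ToyTNM:
    # Every field is optional so that patterns exposing fewer groups still work.
    tumour: Optional[str] = None
    node_prefix: Optional[str] = None
    node: Optional[str] = None
    metastasis_prefix: Optional[str] = None
    metastasis: Optional[str] = None
    pleura: Optional[str] = None

    @classmethod
    def from_match(cls, match: "re.Match[str]") -> "ToyTNM":
        # Keep only groups that matched and that the model knows about.
        known = {f.name for f in fields(cls)}
        groups = match.groupdict()
        return cls(**{k: v for k, v in groups.items() if v and k in known})

    def norm(self) -> str:
        # Concatenate the components that are present, in a fixed order.
        parts = []
        if self.tumour:
            parts.append(f"T{self.tumour}")
        if self.node_prefix:
            parts.append(self.node_prefix)
        if self.node:
            parts.append(f"N{self.node}")
        if self.metastasis_prefix:
            parts.append(self.metastasis_prefix)
        if self.metastasis:
            parts.append(f"M{self.metastasis}")
        if self.pleura:
            parts.append(f"PL{self.pleura}")
        return "".join(parts)


full = ToyTNM.from_match(TOY_TNM.search("Stade T2 pN1 M0 PL1")).norm()
short = ToyTNM.from_match(TOY_TNM.search("T3 N1 M0")).norm()
print(full)   # T2pN1M0PL1
print(short)  # T3N1M0
```

Filtering on known field names is one way to let a single model accept matches from patterns with different group sets; restoring strict typing, as listed in the TODO, would tighten this again.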

TODO

  • model.py: restore proper pydantic typing
  • test_tnm.py: update unit tests

Checklist

  • [ ] If this PR is a bug fix, the bug is documented in the test suite.
  • [ ] Changes were documented in the changelog (pending section).
  • [ ] If necessary, changes were made to the documentation (e.g. new pipeline).

github-actions Bot (Contributor) commented Jan 24, 2025

Coverage Report

Name | Stmts | Miss | ∆ Miss | Cover
edsnlp/pipes/ner/tnm/model.py

New missing coverage at line 21 !

     def __str__(self) -> str:
-         return self.value
New missing coverage at line 127 !
         if self.node_prefix:
-             norm.append(f"{self.node_prefix or ''}")
New missing coverage at line 137 !
         if self.metastasis_prefix:
-             norm.append(f"{self.metastasis_prefix or ''}")
New missing coverage at line 142 !
             if self.metastasis_specification:
-                 norm.append(f"{self.metastasis_specification or ''}")
New missing coverage at line 145 !
         if self.pleura:
-             norm.append(f"PL{self.pleura}")
New missing coverage at line 150 !
             if self.resection_specification:
-                 norm.append(f"{self.resection_specification or ''}")
             if self.resection_loc:
New missing coverage at line 152 !
             if self.resection_loc:
-                 norm.append(f"{self.resection_loc or ''}")
Was already missing at line 160
     def __str__(self):
-         return self.norm()
Was already missing at line 184
             )
-             exclude_unset = skip_defaults
New missing coverage at line 216 !
             if isinstance(v, TnmEnum):
-                 d[k] = v.value

Stmts: 131 | Miss: 10 | ∆ Miss: 8 | Cover: 92.37%
edsnlp/pipes/ner/tnm/tnm.py

New missing coverage at line 129 !

             if clean.strip().lower() in banned_words:
-                 continue
             if (

Stmts: 46 | Miss: 1 | ∆ Miss: 1 | Cover: 97.83%
TOTAL | Stmts: 13573 | Miss: 268 | ∆ Miss: 7 | Cover: 98.03%
Files without new missing coverage
Name | Stmts | Miss | ∆ Miss | Cover
edsnlp/utils/typing.py

Was already missing at line 44

     def __get_validators__(cls):
-         yield cls.validate

Stmts: 48 | Miss: 1 | ∆ Miss: 0 | Cover: 97.92%
edsnlp/utils/torch.py

Was already missing at line 102

 def load_pruned_obj(obj, _):
-     return obj
Was already missing at line 118
     def save_align_devices_hook(pickler, obj):
-         pickler.save_reduce(load_align_devices_hook, (obj.__dict__,), obj=obj)
Was already missing at lines 121-128
     def load_align_devices_hook(state):
-         state["execution_device"] = MAP_LOCATION
  ...
-     AlignDevicesHook = None
Was already missing at line 143
             if torch.Tensor in copyreg.dispatch_table:
-                 old_dispatch[torch.Tensor] = copyreg.dispatch_table[torch.Tensor]
             copyreg.pickle(torch.Tensor, reduce_empty)

Stmts: 81 | Miss: 9 | ∆ Miss: 0 | Cover: 88.89%
edsnlp/utils/span_getters.py

Was already missing at lines 73-75

     if span_getter is None:
-         yield doclike[:], None
-         return
     if callable(span_getter):
Was already missing at lines 76-78
     if callable(span_getter):
-         yield from span_getter(doclike)
-         return
     for key, span_filter in span_getter.items():
Was already missing at lines 99-102
         else:
-             for span, group in candidates:
-                 if span.label_ in span_filter:
-                     yield span, group
Was already missing at line 107
     if callable(span_setter):
-         span_setter(doc, matches)
     else:
Was already missing at line 138
     if callable(value):
-         return value
     if isinstance(value, str):
Was already missing at line 187
             elif isinstance(v, str):
-                 new_value[k] = [v]
             elif isinstance(v, list) and all(isinstance(i, str) for i in v):

Stmts: 241 | Miss: 10 | ∆ Miss: 0 | Cover: 95.85%
edsnlp/utils/resources.py

Was already missing at line 33

     if not verbs:
-         return conjugated_verbs

Stmts: 25 | Miss: 1 | ∆ Miss: 0 | Cover: 96.00%
edsnlp/utils/numbers.py

Was already missing at line 34

     else:
-         string = s
     string = string.lower().strip()
Was already missing at lines 38-41
         return int(string)
-     except ValueError:
-         parsed = DIGITS_MAPPINGS.get(string, None)
-         return parsed

Stmts: 16 | Miss: 4 | ∆ Miss: 0 | Cover: 75.00%
edsnlp/utils/fuzzy_alignment.py

Was already missing at line 70

         if len(other.begins) == 0:
-             return self
         begins = self.unapply(other.begins, side="left")

Stmts: 191 | Miss: 1 | ∆ Miss: 0 | Cover: 99.48%
edsnlp/utils/filter.py

Was already missing at line 206

     if isinstance(label, int):
-         return [span for span in spans if span.label == label]
     else:

Stmts: 74 | Miss: 1 | ∆ Miss: 0 | Cover: 98.65%
edsnlp/utils/file_system.py

Was already missing at line 39

     if isinstance(filesystem, str):
-         filesystem = fsspec.filesystem(filesystem)
Was already missing at line 50
         if not isinstance(inferred_protocols, (list, tuple, set)):
-             inferred_protocols = [inferred_fs.protocol]
         if not isinstance(filesystem_protocols, (list, tuple, set)):
Was already missing at line 52
         if not isinstance(filesystem_protocols, (list, tuple, set)):
-             filesystem_protocols = [filesystem.protocol]
         assert set(filesystem_protocols) & set(inferred_protocols), (

Stmts: 31 | Miss: 3 | ∆ Miss: 0 | Cover: 90.32%
edsnlp/training/trainer.py

Was already missing at line 59

     if result is None:
-         result = {}
     if isinstance(x, dict):
Was already missing at line 118
             # fmt: off
-             autocast = {
                "fp16": torch.float16, "float16": torch.float16,
Was already missing at lines 379-385
         if self.sub_batch_size and self.sub_batch_size[1] == "splits":
-             data = data.batchify(
  ...
-             data = data.map(lambda b: [nlp.collate(sb, device=device) for sb in b])
         elif self.sub_batch_size:
Was already missing at lines 938-945
                         raise
-                     except Exception:
  ...
-                         raise
Was already missing at lines 972-974
                     ) > grad_max_dev * math.sqrt(grad_var):
-                         spike = True
-                         spikes += 1
                     else:
Was already missing at line 981
                     if spike and grad_dev_policy == "clip_mean":
-                         torch.nn.utils.clip_grad_norm_(
                             grad_params, grad_mean, norm_type=2
Was already missing at line 985
                     elif spike and grad_dev_policy == "clip_threshold":
-                         torch.nn.utils.clip_grad_norm_(
                             grad_params,

Stmts: 357 | Miss: 13 | ∆ Miss: 0 | Cover: 96.36%
edsnlp/training/loggers.py

Was already missing at line 65

         if self._file is not None:
-             return
         os.makedirs(self.logging_dir, exist_ok=True)
Was already missing at line 102
                 if col not in values and col != "step":
-                     row.append("")
                 else:
Was already missing at line 225
     def tracker(self):
-         return self.printer
Was already missing at line 293
             )
-             logging_dir = env_logging_dir
         assert logging_dir is not None, (

Stmts: 160 | Miss: 4 | ∆ Miss: 0 | Cover: 97.50%
edsnlp/reducers.py

Was already missing at line 115

     if not hasattr(module, "__file__"):
-         return True
     if module.__file__ is None:
Was already missing at line 117
     if module.__file__ is None:
-         return False
     # Hack to avoid copying the full module dict

682097.06%
edsnlp/processing/spark.py

Was already missing at line 50

         getActiveSession = SparkSession.getActiveSession
-     except AttributeError:

Stmts: 47 | Miss: 1 | ∆ Miss: 0 | Cover: 97.87%
edsnlp/processing/multiprocessing.py

Was already missing at lines 222-230

                     return re.findall(r"/[^\s]+\.so[^\s]*", f.read())
-             except Exception:
  ...
-             return []
Was already missing at lines 233-235
         loaded = loaded_libs()
-     except Exception:
-         return False
     return any(any(k in os.path.basename(p).lower() for k in libs) for p in loaded)
Was already missing at line 254
         )
-         method = "spawn"
Was already missing at lines 258-264
     if has_hdfs and method == "fork":
-         safe = "forkserver" if "forkserver" in methods else "spawn"
  ...
-         method = safe
Was already missing at lines 454-457
                 for _ in self.iter_tasks(stage=stage, stop_mode=True):
-                     pass
-             except StopSignal:
-                 pass
             for name, queue in self.consumer_queues(stage):
Was already missing at lines 672-674
             if isinstance(docs, StreamSentinel):
-                 self.active_batches[stage].append([None, None, None, docs])
-                 continue
             batch_id = str(hash(tuple(id(x) for x in docs)))[-8:] + "-" + self.uid
Was already missing at line 1144
             if self.error:
-                 raise self.error
         finally:
Was already missing at lines 1202-1208
                 if out[0].kind == requires_sentinel:
-                     missing_sentinels -= 1
  ...
-                         missing_sentinels = len(self.cpu_worker_names)
                 continue

Stmts: 671 | Miss: 24 | ∆ Miss: 0 | Cover: 96.42%
edsnlp/processing/deprecated_pipe.py

Was already missing at lines 207-209

         def converter(doc):
-             res = results_extractor(doc)
-             return (
                 [{"note_id": doc._.note_id, **row} for row in res]

Stmts: 53 | Miss: 2 | ∆ Miss: 0 | Cover: 96.23%
edsnlp/pipes/trainable/span_linker/span_linker.py

Was already missing at lines 402-404

             if self.reference_mode == "synonym":
-                 embeds = embeds.to(new_lin.weight)
-                 new_lin.weight.data = embeds
             else:

Stmts: 175 | Miss: 2 | ∆ Miss: 0 | Cover: 98.86%
edsnlp/pipes/trainable/span_classifier/span_classifier.py

Was already missing at line 380

         if not all(keep_bindings):
-             logger.warning(
                 "Some attributes have no labels or values and have been removed:"

Stmts: 171 | Miss: 1 | ∆ Miss: 0 | Cover: 99.42%
edsnlp/pipes/trainable/ner_crf/ner_crf.py

Was already missing at line 302

         if self.labels is not None and not self.infer_span_setter:
-             return
Was already missing at lines 310-312
             if callable(self.target_span_getter):
-                 for span in get_spans(doc, self.target_span_getter):
-                     inferred_labels.add(span.label_)
             else:
Was already missing at line 447
             )
-             self._has_warned = True

Stmts: 177 | Miss: 4 | ∆ Miss: 0 | Cover: 97.74%
edsnlp/pipes/trainable/layers/crf.py

Was already missing at line 80

         if learnable_transitions:
-             self.transitions = torch.nn.Parameter(
                 torch.zeros_like(forbidden_transitions, dtype=torch.float)
Was already missing at line 90
         if learnable_transitions and with_start_end_transitions:
-             self.start_transitions = torch.nn.Parameter(
                 torch.zeros(num_tags, dtype=torch.float)
Was already missing at line 99
         if learnable_transitions and with_start_end_transitions:
-             self.end_transitions = torch.nn.Parameter(
                 torch.zeros(num_tags, dtype=torch.float)

Stmts: 138 | Miss: 3 | ∆ Miss: 0 | Cover: 97.83%
edsnlp/pipes/trainable/embeddings/transformer/transformer.py

Was already missing at line 167

         if quantization is not None:
-             kwargs["quantization_config"] = quantization
Was already missing at line 192
         if self.cls_token_id is None:
-             [self.cls_token_id] = self.tokenizer.convert_tokens_to_ids(
                 [self.tokenizer.special_tokens_map["bos_token"]]
Was already missing at line 196
         if self.sep_token_id is None:
-             [self.sep_token_id] = self.tokenizer.convert_tokens_to_ids(
                 [self.tokenizer.special_tokens_map["eos_token"]]

Stmts: 168 | Miss: 3 | ∆ Miss: 0 | Cover: 98.21%
edsnlp/pipes/qualifiers/reported_speech/reported_speech.py

Was already missing at lines 24-28

         return "REPORTED"
-     elif token._.rspeech is False:
-         return "DIRECT"
-     else:
-         return None

Stmts: 100 | Miss: 3 | ∆ Miss: 0 | Cover: 97.00%
edsnlp/pipes/qualifiers/negation/negation.py

Was already missing at line 28

     else:
-         return None

Stmts: 101 | Miss: 1 | ∆ Miss: 0 | Cover: 99.01%
edsnlp/pipes/qualifiers/hypothesis/hypothesis.py

Was already missing at line 27

     else:
-         return None

Stmts: 98 | Miss: 1 | ∆ Miss: 0 | Cover: 98.98%
edsnlp/pipes/qualifiers/history/history.py

Was already missing at lines 26-32

 def history_getter(token: Union[Token, Span]) -> Optional[str]:
-     if token._.history is True:
-         return "ATCD"
-     elif token._.history is False:
-         return "CURRENT"
-     else:
-         return None
Was already missing at lines 351-357
                 )
-             except ValueError:
  ...
-                 note_datetime = None
Was already missing at lines 366-372
                 )
-             except ValueError:
  ...
-                 birth_datetime = None
Was already missing at lines 438-441
                         )
-                     except ValueError as e:
-                         absolute_date = None
-                         logger.warning(
                             "In doc {}, the following date {} raises this error: {}. "

Stmts: 180 | Miss: 14 | ∆ Miss: 0 | Cover: 92.22%
edsnlp/pipes/qualifiers/family/family.py

Was already missing at line 27

     else:
-         return None

Stmts: 83 | Miss: 1 | ∆ Miss: 0 | Cover: 98.80%
edsnlp/pipes/qualifiers/base.py

Was already missing at line 123

             elif on_ents_only is not True:
-                 assert span_getter is None, (
                     "Cannot use both `span_getter` and `on_ents_only` as a span "

Stmts: 52 | Miss: 1 | ∆ Miss: 0 | Cover: 98.08%
edsnlp/pipes/ner/scores/sofa/sofa.py

Was already missing at line 32

             if not assigned:
-                 continue
             if assigned.get("method_max") is not None:
Was already missing at line 40
             else:
-                 method = "Non précisée"

Stmts: 25 | Miss: 2 | ∆ Miss: 0 | Cover: 92.00%
edsnlp/pipes/ner/scores/elston_ellis/patterns.py

Was already missing at line 26

         if x <= 5:
-             return 1
Was already missing at lines 32-36
         else:
-             return 3
- 
-     except ValueError:
-         return None

Stmts: 21 | Miss: 4 | ∆ Miss: 0 | Cover: 80.95%
edsnlp/pipes/ner/scores/charlson/patterns.py

Was already missing at lines 21-23

             return int(extracted_score)
-     except ValueError:
-         return None

Stmts: 13 | Miss: 2 | ∆ Miss: 0 | Cover: 84.62%
edsnlp/pipes/ner/disorders/solid_tumor/solid_tumor.py

Was already missing at lines 131-137

         for span in spans:
-             span.label_ = "solid_tumor"
  ...
-             yield span

Stmts: 38 | Miss: 6 | ∆ Miss: 0 | Cover: 84.21%
edsnlp/pipes/ner/disorders/peripheral_vascular_disease/peripheral_vascular_disease.py

Was already missing at line 108

                 if "peripheral" not in span._.assigned.keys():
-                     continue

Stmts: 16 | Miss: 1 | ∆ Miss: 0 | Cover: 93.75%
edsnlp/pipes/ner/disorders/diabetes/diabetes.py

Was already missing at line 131

                 # Mostly FP
-                 continue
Was already missing at line 134
             elif self.has_far_complications(span):
-                 span._.status = 2
Was already missing at line 145
         if next(iter(self.complication_matcher(context)), None) is not None:
-             return True
         return False

Stmts: 30 | Miss: 3 | ∆ Miss: 0 | Cover: 90.00%
edsnlp/pipes/ner/disorders/connective_tissue_disease/connective_tissue_disease.py

Was already missing at line 104

                 # Huge change of FP / Title section
-                 continue

Stmts: 15 | Miss: 1 | ∆ Miss: 0 | Cover: 93.33%
edsnlp/pipes/ner/disorders/ckd/ckd.py

Was already missing at lines 121-124

             dfg_value = float(dfg_span.text.replace(",", ".").strip())
-         except ValueError:
-             logger.trace(f"DFG value couldn't be extracted from {dfg_span.text}")
-             return False

Stmts: 30 | Miss: 3 | ∆ Miss: 0 | Cover: 90.00%
edsnlp/pipes/ner/disorders/cerebrovascular_accident/cerebrovascular_accident.py

Was already missing at lines 112-114

             if span._.source == "ischemia":
-                 if "brain" not in span._.assigned.keys():
-                     continue

Stmts: 18 | Miss: 2 | ∆ Miss: 0 | Cover: 88.89%
edsnlp/pipes/ner/disorders/base.py

Was already missing at lines 119-122

             if span._.status is not None and span._.status not in all_detailed_status:
-                 default_status = 1 if 1 in all_detailed_status else None
  ...
-                 span._.status = default_status
             span._.detailed_status = self.detailed_status_mapping.get(

Stmts: 31 | Miss: 2 | ∆ Miss: 0 | Cover: 93.55%
edsnlp/pipes/ner/adicap/models.py

Was already missing at line 15

     def norm(self) -> str:
-         return self.code
Was already missing at line 18
     def __str__(self):
-         return self.norm()

Stmts: 14 | Miss: 2 | ∆ Miss: 0 | Cover: 85.71%
edsnlp/pipes/misc/split/split.py

Was already missing at lines 186-188

         if max_length <= 0 and self.regex is None:
-             yield doc
-             return

Stmts: 74 | Miss: 2 | ∆ Miss: 0 | Cover: 97.30%
edsnlp/pipes/misc/sections/sections.py

Was already missing at line 126

         if sections is None:
-             sections = patterns.sections
         sections = dict(sections)

Stmts: 46 | Miss: 1 | ∆ Miss: 0 | Cover: 97.83%
edsnlp/pipes/misc/quantities/quantities.py

Was already missing at lines 195-197

     def __getitem__(self, item: int):
-         assert isinstance(item, int)
-         return [self][item]
Was already missing at lines 209-215
     def __eq__(self, other: Any):
-         if isinstance(other, SimpleQuantity):
  ...
-         return False
Was already missing at line 218
         if other.unit == self.unit:
-             return SimpleQuantity(
                 self.value + other.value,
Was already missing at line 272
     def verify(cls, ent):
-         return True
Was already missing at line 338
     def __lt__(self, other: Union[SimpleQuantity, "RangeQuantity"]):
-         return max(self.convert_to(other.unit)) < min((part.value for part in other))
Was already missing at line 361
             return self.convert_to(other.unit) == other.value
-         return False
Was already missing at line 375
     def verify(cls, ent):
-         return True
Was already missing at line 1357
         if snippet.end != last and doclike.doc[last : snippet.end].text.strip() == "":
-             pseudo.append("w")
         pseudo = "".join(pseudo)
Was already missing at lines 1738-1742
                         ):
-                             unitless_pattern = self.unitless_patterns[
  ...
-                             unit_norm = next(
                                 scope["unit"]
Was already missing at line 1783
             ):
-                 ent = doc[min(ent_start, unit_text.start) : number.end]
             else:

Stmts: 702 | Miss: 14 | ∆ Miss: 0 | Cover: 98.01%
edsnlp/pipes/misc/dates/models.py

Was already missing at line 157

                     else:
-                         d["month"] = note_datetime.month
                 if self.day is None:
Was already missing at lines 161-167
             else:
-                 if self.year is None:
  ...
-                     d["day"] = default_day
Was already missing at lines 175-177
                 return dt
-             except ValueError:
-                 return None
Was already missing at line 193
         else:
-             return None
Was already missing at line 209
         if self.second:
-             norm += f"{self.second:02}s"

Stmts: 203 | Miss: 11 | ∆ Miss: 0 | Cover: 94.58%
edsnlp/pipes/misc/dates/dates.py

Was already missing at line 249

         if isinstance(absolute, str):
-             absolute = [absolute]
         if isinstance(relative, str):
Was already missing at line 251
         if isinstance(relative, str):
-             relative = [relative]
         if isinstance(duration, str):
Was already missing at line 253
         if isinstance(duration, str):
-             relative = [duration]
         if isinstance(false_positive, str):
Was already missing at lines 357-366
             if self.merge_mode == "align":
-                 alignments = align_spans(matches, spans, sort_by_overlap=True)
  ...
-                         matches.append(span)
Was already missing at lines 462-464
                 if v1.mode == Mode.DURATION:
-                     m1 = Bound.FROM if v2.bound == Bound.UNTIL else Bound.UNTIL
-                     m2 = v2.mode or Bound.FROM
                 elif v2.mode == Mode.DURATION:

Stmts: 153 | Miss: 14 | ∆ Miss: 0 | Cover: 90.85%
edsnlp/pipes/misc/consultation_dates/consultation_dates.py

Was already missing at line 131

         else:
-             self.date_matcher = None
Was already missing at line 134
         if not consultation_mention:
-             consultation_mention = []
         elif consultation_mention is True:

Stmts: 48 | Miss: 2 | ∆ Miss: 0 | Cover: 95.83%
edsnlp/pipes/llm/llm_span_qualifier/llm_span_qualifier.py

Was already missing at line 579

         if isinstance(formatted_context, Doc):
-             context_text = formatted_context.text
         else:
Was already missing at line 657
             if start == -1 or end <= 0 or end <= start:
-                 return None
             try:
Was already missing at line 750
             if next_yield >= len(doc_states):
-                 return
             for state in doc_states[next_yield:]:

Stmts: 249 | Miss: 3 | ∆ Miss: 0 | Cover: 98.80%
edsnlp/pipes/llm/llm_markup_extractor/llm_markup_extractor.py

Was already missing at line 309

         if seed is not None:
-             api_kwargs["seed"] = seed
         self.retriever = None
Was already missing at line 355
             if span is None:
-                 continue
             spans.append(span)
Was already missing at lines 467-469
                 if not contexts:
-                     remaining_ctx_counts[doc_idx] = 0
-                     buffer[doc_idx] = doc
                 else:
Was already missing at line 490
             if result is None:
-                 pass
             else:

Stmts: 157 | Miss: 5 | ∆ Miss: 0 | Cover: 96.82%
edsnlp/pipes/core/normalizer/__init__.py

Was already missing at line 7

 def excluded_or_space_getter(t):
-     return t.is_space or t.tag_ == "EXCLUDED"

Stmts: 5 | Miss: 1 | ∆ Miss: 0 | Cover: 80.00%
edsnlp/pipes/core/endlines/endlines.py

Was already missing at lines 160-164

         if end_lines_model is None:
-             path = build_path(__file__, "base_model.pkl")
- 
-             with open(path, "rb") as inp:
-                 self.model = pickle.load(inp)
         elif isinstance(end_lines_model, str):
Was already missing at lines 167-169
                 self.model = pickle.load(inp)
-         elif isinstance(end_lines_model, EndLinesModel):
-             self.model = end_lines_model
         else:
Was already missing at line 200
         ):
-             return "ENUMERATION"
Was already missing at line 287
         if np.isnan(sigma):
-             sigma = 1

Stmts: 89 | Miss: 7 | ∆ Miss: 0 | Cover: 92.13%
edsnlp/pipes/core/contextual_matcher/contextual_matcher.py

Was already missing at lines 242-244

             ):
-                 to_keep = False
-                 break

Stmts: 130 | Miss: 2 | ∆ Miss: 0 | Cover: 98.46%
edsnlp/patch_spacy.py

Was already missing at lines 67-69

             # if module is reloaded.
-             existing_func = registry.factories.get(internal_name)
-             if not util.is_same_func(factory_func, existing_func):
                 raise ValueError(

Stmts: 31 | Miss: 2 | ∆ Miss: 0 | Cover: 93.55%
edsnlp/package.py

Was already missing at lines 474-476

             version = version or pyproject["project"]["version"]
-         except (KeyError, TypeError):
-             version = "0.1.0"
         name = name or pyproject["project"]["name"]
Was already missing at line 480
         else:
-             main_package = None
         model_package = snake_case(name.lower())

Stmts: 214 | Miss: 3 | ∆ Miss: 0 | Cover: 98.60%
edsnlp/metrics/span_attribute.py

Was already missing at lines 68-70

         )
-         assert attributes is None
-         attributes = kwargs.pop("qualifiers")
     if attributes is None:

Stmts: 93 | Miss: 2 | ∆ Miss: 0 | Cover: 97.85%
edsnlp/matchers/simstring.py

Was already missing at line 280

     if custom:
-         attr = attr[1:].lower()
Was already missing at line 295
             if custom:
-                 token_text = getattr(token._, attr)
             else:

Stmts: 146 | Miss: 2 | ∆ Miss: 0 | Cover: 98.63%
edsnlp/language.py

Was already missing at line 103

             if last != begin:
-                 logger.warning(
                     "Missed some characters during"

Stmts: 52 | Miss: 1 | ∆ Miss: 0 | Cover: 98.08%
edsnlp/data/standoff.py

Was already missing at line 38

     def __init__(self, ann_file, line):
-         super().__init__(f"File {ann_file}, unrecognized Brat line {line}")
Was already missing at line 192
                         )
-                 except Exception:
                     raise Exception(

Stmts: 186 | Miss: 2 | ∆ Miss: 0 | Cover: 98.92%
edsnlp/data/polars.py

Was already missing at line 36

         if hasattr(data, "collect"):
-             data = data.collect()
         assert isinstance(data, pl.DataFrame)

Stmts: 55 | Miss: 1 | ∆ Miss: 0 | Cover: 98.18%
edsnlp/data/json.py

Was already missing at line 81

                 return records
-         except Exception as e:
             raise Exception(f"Cannot read {file}: {e}")

Stmts: 112 | Miss: 1 | ∆ Miss: 0 | Cover: 99.11%
edsnlp/data/huggingface_dataset.py

Was already missing at line 259

         if isinstance(item, DatasetEndSentinel):
-             continue
         else:

Stmts: 99 | Miss: 1 | ∆ Miss: 0 | Cover: 98.99%
edsnlp/data/converters.py

Was already missing at line 427

                 elif key == "XPOS":
-                     word.tag_ = value
                 elif key == "FEATS":
Was already missing at line 835
         if self.keep_raw_attribute_values:
-             return value
         try:
Was already missing at line 897
                 if not attr:
-                     continue
                 if "=" in attr:
Was already missing at line 928
             if span is None:
-                 continue
             for k, v in attrs.items():
Was already missing at line 998
         if isinstance(value, (bool, int, float)):
-             return repr(value)
         s = str(value)
Was already missing at line 1307
                 if current_type is not None:
-                     entities.append((start_idx, i, current_type))
                 start_idx = i
Was already missing at line 1401
             if start < 0 or start >= len(tags):
-                 continue
             tags[start] = f"B-{label}"
Was already missing at line 1445
     if isinstance(converter, type):
-         return converter(**kwargs), {}
     return converter, validate_kwargs(converter, kwargs)

Stmts: 500 | Miss: 8 | ∆ Miss: 0 | Cover: 98.40%
edsnlp/data/conll.py

Was already missing at lines 81-83

             )
-         except StopIteration:
-             cols = DEFAULT_COLUMNS
             warnings.warn(
Was already missing at lines 92-96
         if not line:
-             if doc["words"]:
-                 yield doc
-                 doc = {"words": []}
-             continue
         if line.startswith("#"):

Stmts: 76 | Miss: 6 | ∆ Miss: 0 | Cover: 92.11%
edsnlp/core/torch_component.py

Was already missing at line 407

             if hasattr(self, "compiled"):
-                 res = self.compiled(batch)
             else:
Was already missing at line 453
         """
-         return self.preprocess(doc)

Stmts: 190 | Miss: 2 | ∆ Miss: 0 | Cover: 98.95%
edsnlp/core/stream.py

Was already missing at line 155

             else:
-                 yield res
             return
Was already missing at lines 203-205
                 if isinstance(batch, StreamSentinel):
-                     yield batch
-                     continue
                 results = []
Was already missing at lines 1030-1032
                 elif op.batch_fn is None:
-                     batch_size = op.size
-                     batch_fn = batchify
                 else:

Stmts: 383 | Miss: 5 | ∆ Miss: 0 | Cover: 98.69%
edsnlp/core/registries.py

Was already missing at line 138

         if isinstance(obj, DraftPipe):
-             return obj
         elif isinstance(obj, dict):
Was already missing at line 143
                 if result is not None:
-                     return result
         elif isinstance(obj, (tuple, list, set)):
Was already missing at line 148
                 if result is not None:
-                     return result
         return None

Stmts: 221 | Miss: 3 | ∆ Miss: 0 | Cover: 98.64%
edsnlp/core/pipeline.py

Was already missing at line 607

             if name in exclude:
-                 continue
             if name not in components:
Was already missing at lines 715-718
         """
-         res = Stream.ensure_stream(docs)
-         res = res.map(functools.partial(self.preprocess, supervision=supervision))
-         return res

Stmts: 463 | Miss: 4 | ∆ Miss: 0 | Cover: 99.14%
edsnlp/connectors/omop.py

Was already missing at line 69

         if not isinstance(row.ents, list):
-             continue
Was already missing at line 87
             else:
-                 doc.spans[span.label_].append(span)
Was already missing at line 127
     if df.note_id.isna().any():
-         df["note_id"] = range(len(df))
Was already missing at line 171
         if i > 0:
-             df.term_modifiers += ";"
         df.term_modifiers += ext + "=" + df[ext].astype(str)

Stmts: 84 | Miss: 4 | ∆ Miss: 0 | Cover: 95.24%
edsnlp/_version.py

Was already missing at line 21

     if repo_root is None:
-         return base_version
Was already missing at line 39
-     return (
         base_version

Stmts: 15 | Miss: 2 | ∆ Miss: 0 | Cover: 86.67%
edsnlp/tune.py

Was already missing at line 221

         return logger_config.get("@loggers") == "json"
-     return False
Was already missing at line 291
                 continue
-             raise
         for feature, importance in importance_scores.items():
Was already missing at line 399
             ):
-                 resolved_key = int(key)
             try:
Was already missing at lines 650-652
             return os.path.join(info.root_dir, metrics_relpath)
-         time.sleep(1)
-     return None
Was already missing at line 834
             if was_pruned:
-                 _handle_pruned_dvc_runs(queue, entries, entry)
             if not (result and result.exp_hash and result.ref_info):
Was already missing at line 1062
         else:
-             config = copy.deepcopy(raw_config)
         updated_config = update_config(
Was already missing at line 1197
     else:
-         config_path_phase_2 = os.path.join(output_dir_phase_1, "config.cfg")

Stmts: 626 | Miss: 8 | ∆ Miss: -2 | Cover: 98.72%

278 files skipped due to complete coverage.

Coverage failure: total of 98.03% is less than 98.07% ❌
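
The failure can be reproduced from the TOTAL row with a few lines of arithmetic. This is a sketch under two assumptions that the report does not state explicitly: that coverage is the plain ratio of covered statements to total statements, and that the 98.07% threshold applies to the same statement count.

```python
# Figures from the TOTAL row of the coverage report.
stmts, miss = 13573, 268
threshold = 98.07  # required total coverage, in percent


def cover(stmts: int, miss: int) -> float:
    """Coverage in percent, assuming a plain covered/total ratio."""
    return 100 * (stmts - miss) / stmts


current = cover(stmts, miss)

# Largest number of missed statements that still meets the threshold.
allowed = miss
while cover(stmts, allowed) < threshold:
    allowed -= 1

to_cover = miss - allowed
print(f"{current:.2f}%")  # 98.03%
print(to_cover)           # 7
```

Under these assumptions, covering 7 of the currently missed statements would bring the total back above the threshold, which matches the ∆ Miss of 7 in the TOTAL row.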

github-actions Bot (Contributor) commented May 14, 2025

Docs preview URL

https://edsnlp-tnm-new-regex.vercel.app

sonarqubecloud Bot commented Jul 4, 2025

percevalw force-pushed the master branch 2 times, most recently from d2e1f39 to 65669dc (September 4, 2025 07:26)
LucasDedieu closed this Jan 8, 2026
LucasDedieu reopened this Feb 10, 2026
