{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"# **Analysis 2**\n",
"\n",
"## Strategy & Focus\n",
"\n",
"- Focus on Exports 4 and 5 (the largest datasets)\n",
"    - each dataset belongs to a different customer\n",
"    - as a result: mismatches between IDs and their associated descriptions; ObjektID values assigned more than once\n",
"\n",
"### Feature 1 - Process descriptions:\n",
"\n",
"- Analysis of possible clusters in ``VorgangsBeschreibung``:\n",
"    - possibly derive standardized, selectable descriptions\n",
"    - typical terms and recurring occurrences\n",
"- Additional information from ``VorgangsArtText``:\n",
"    - partially standardized\n",
"    - *Is the link to ``VorgangsBeschreibung`` semantically correct?*\n",
"- Additional information from ``VorgangsTypName`` together with ``VorgangsTypID``:\n",
"    - definitely standardized\n",
"    - *Number of unique types?*\n",
"\n",
"### Feature 2 - Time relationships within the processes\n",
"\n",
"- *Identify objects that occur frequently*\n",
"- *Examine the time intervals between creation, planning, and completion:*\n",
"    - Creation: ``ErstellungsDatum``\n",
"    - Planning: ``VorgangsDatum``\n",
"    - Completion: ``ErledigungsDatum``\n",
"- *Intervals between two similar failure patterns, either for every object or for the most frequently occurring objects*"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"---\n",
"\n",
"# Feature 1: Clustering of process descriptions\n",
"\n",
"## Research\n",
"[Textmining HS Hannover](https://textmining.wp.hs-hannover.de/Preprocessing.html)\n",
"\n",
"### General decomposition of the individual descriptions\n",
"\n",
"- Text into sentences\n",
"- Sentences into words\n",
"- Words into their base form:\n",
"    - Lemma: the form of the word as listed in a dictionary, e.g. Haus, laufen, begründen\n",
"    - Stem: the word without inflectional endings (prefixes and suffixes), e.g. Haus, lauf, begründ\n",
"    - Root: the core of the word from which it may have been formed by derivation, e.g. Haus, lauf, Grund\n",
"- Part-of-speech tagging\n",
"    - classic part-of-speech tagging (conventional word classes)\n",
"    - Named Entity Recognition (NER) (proper names)\n",
"        - e.g. spaCy: person, location, organization, miscellaneous\n",
"\n",
"#### Semantics\n",
"\n",
"- Words within a sentence are more closely related to each other than to words outside it\n",
"\n",
"### Packages\n",
"\n",
"- English: \n",
"    - [NLTK](https://www.nltk.org/)\n",
"- German:\n",
"    - [HanTa - The Hanover Tagger](https://github.com/wartaal/HanTa/tree/master)\n",
"    - [TreeTagger](https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/)\n",
"        - [Python Wrapper](https://treetaggerwrapper.readthedocs.io/en/latest/)\n",
"    - [spaCy](https://spacy.io/)\n",
"        - [Example 1](https://www.trinnovative.de/blog/2020-09-08-natural-language-processing-mit-spacy.html)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"## Analysis"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"import pandas as pd\n",
|
||
"import spacy\n",
|
||
"from collections import Counter\n",
|
||
"from itertools import combinations\n",
|
||
"from dateutil.parser import parse\n",
|
||
"import re\n",
|
||
"from spellchecker import SpellChecker\n",
|
||
"\n",
|
||
"import matplotlib.pyplot as plt\n",
|
||
"import seaborn as sns\n",
|
||
"\n",
|
||
"import logging\n",
|
||
"import sys\n",
|
||
"import pickle\n",
|
||
"\n",
|
||
"LOGGING_LEVEL = 'INFO'\n",
|
||
"logging.basicConfig(level=LOGGING_LEVEL, stream=sys.stdout)\n",
|
||
"logger = logging.getLogger('base')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"def save_pickle(obj, path):\n",
"    with open(path, 'wb') as file:\n",
"        pickle.dump(obj, file, protocol=pickle.HIGHEST_PROTOCOL)\n",
"\n",
"def load_pickle(path):\n",
"    with open(path, 'rb') as file:\n",
"        obj = pickle.load(file)\n",
"    return obj"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"sns.set()\n",
|
||
"LOAD_CALC_FILES = False\n",
|
||
"\n",
|
||
"DESC_BLACKLIST = set(['-'])\n",
|
||
"\"\"\"\n",
|
||
"GENERAL_BLACKLIST = set([\n",
|
||
" 'herr', 'hr.', 'förster', 'graf', 'stöppel', \n",
|
||
" 'stab', 'kw', 'h.', 'koch', 'heininger', '.',\n",
|
||
" 'schwab', 'm.', 'wenninger', '-', '--',\n",
|
||
"])\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"GENERAL_BLACKLIST = set([\n",
"    'herr', 'hr.', 'kw', 'h.', '.',\n",
"    'm.', '-', '--', 'dr.', 'dr',\n",
|
||
"])\n",
|
||
"\n",
|
||
"#GENERAL_BLACKLIST = set()\n",
|
||
"#POS_of_interest = set(['NOUN', 'PROPN', 'ADJ', 'VERB', 'AUX'])\n",
|
||
"POS_of_interest = set(['NOUN', 'ADJ', 'VERB', 'AUX'])\n",
|
||
"TAG_of_interest = set(['ADJD'])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# load language model\n",
|
||
"nlp = spacy.load('de_dep_news_trf')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"RangeIndex: 129020 entries, 0 to 129019\n",
|
||
"Data columns (total 20 columns):\n",
|
||
" # Column Non-Null Count Dtype \n",
|
||
"--- ------ -------------- ----- \n",
|
||
" 0 VorgangsID 129020 non-null int64 \n",
|
||
" 1 ObjektID 129020 non-null int64 \n",
|
||
" 2 HObjektText 129003 non-null object \n",
|
||
" 3 ObjektArtID 129020 non-null int64 \n",
|
||
" 4 ObjektArtText 128372 non-null object \n",
|
||
" 5 VorgangsTypID 129020 non-null int64 \n",
|
||
" 6 VorgangsTypName 129020 non-null object \n",
|
||
" 7 VorgangsDatum 129020 non-null datetime64[ns]\n",
|
||
" 8 VorgangsStatusId 129020 non-null int64 \n",
|
||
" 9 VorgangsPrioritaet 129020 non-null int64 \n",
|
||
" 10 VorgangsBeschreibung 124087 non-null object \n",
|
||
" 11 VorgangsOrt 507 non-null object \n",
|
||
" 12 VorgangsArtText 129020 non-null object \n",
|
||
" 13 ErledigungsDatum 129020 non-null datetime64[ns]\n",
|
||
" 14 ErledigungsArtText 128474 non-null object \n",
|
||
" 15 ErledigungsBeschreibung 118135 non-null object \n",
|
||
" 16 MPMelderArbeitsplatz 6359 non-null object \n",
|
||
" 17 MPAbteilungBezeichnung 6359 non-null object \n",
|
||
" 18 Arbeitsbeginn 123538 non-null datetime64[ns]\n",
|
||
" 19 ErstellungsDatum 129020 non-null datetime64[ns]\n",
|
||
"dtypes: datetime64[ns](4), int64(6), object(10)\n",
|
||
"memory usage: 19.7+ MB\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# load dataset\n",
|
||
"FILE_PATH = '01_2_Rohdaten_neu/Export4.csv'\n",
|
||
"date_cols = ['VorgangsDatum', 'ErledigungsDatum', 'Arbeitsbeginn', 'ErstellungsDatum']\n",
|
||
"raw = pd.read_csv(filepath_or_buffer=FILE_PATH, sep=';', encoding='cp1252', parse_dates=date_cols, dayfirst=True)\n",
|
||
"raw.info()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>VorgangsID</th>\n",
|
||
" <th>ObjektID</th>\n",
|
||
" <th>HObjektText</th>\n",
|
||
" <th>ObjektArtID</th>\n",
|
||
" <th>ObjektArtText</th>\n",
|
||
" <th>VorgangsTypID</th>\n",
|
||
" <th>VorgangsTypName</th>\n",
|
||
" <th>VorgangsDatum</th>\n",
|
||
" <th>VorgangsStatusId</th>\n",
|
||
" <th>VorgangsPrioritaet</th>\n",
|
||
" <th>VorgangsBeschreibung</th>\n",
|
||
" <th>VorgangsOrt</th>\n",
|
||
" <th>VorgangsArtText</th>\n",
|
||
" <th>ErledigungsDatum</th>\n",
|
||
" <th>ErledigungsArtText</th>\n",
|
||
" <th>ErledigungsBeschreibung</th>\n",
|
||
" <th>MPMelderArbeitsplatz</th>\n",
|
||
" <th>MPAbteilungBezeichnung</th>\n",
|
||
" <th>Arbeitsbeginn</th>\n",
|
||
" <th>ErstellungsDatum</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>11</td>\n",
|
||
" <td>114</td>\n",
|
||
" <td>427 C , Webmaschine, DL 280 EMS Breite 280</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-06</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Kettbaum kaputt</td>\n",
|
||
" <td>2019-03-06</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>2019-03-06</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>17</td>\n",
|
||
" <td>124</td>\n",
|
||
" <td>621 C , Webmaschine, DL 280 EMS Breite 280</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-11</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>asgasdg</td>\n",
|
||
" <td>2019-03-11</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Elektrowerkstatt</td>\n",
|
||
" <td>Elektrowerkstatt</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>2019-03-11</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>53</td>\n",
|
||
" <td>244</td>\n",
|
||
" <td>285 C, Webmaschine, SG 220 EMS</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>Greifer-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-19</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Kupplung schleift</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Kupplung defekt</td>\n",
|
||
" <td>2019-03-20</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>2019-03-19</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>58</td>\n",
|
||
" <td>257</td>\n",
|
||
" <td>107, Webmaschine, OM 220 EOS</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Gegengewicht wieder anbringen</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Gegengewicht an der Webmaschine abgefallen</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Schraube ausgebohrt\\nGegengewicht wieder angeb...</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>81</td>\n",
|
||
" <td>138</td>\n",
|
||
" <td>00138, Schärmaschine 9,</td>\n",
|
||
" <td>16</td>\n",
|
||
" <td>Schärmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>da ist etwas gebrochen. (Herr Heininger)</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>zentrale Bremsenverstellung linke Gatterseite ...</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Bolzen gebrochen. Bolzen neu angefertig und di...</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" VorgangsID ObjektID HObjektText \\\n",
|
||
"0 11 114 427 C , Webmaschine, DL 280 EMS Breite 280 \n",
|
||
"1 17 124 621 C , Webmaschine, DL 280 EMS Breite 280 \n",
|
||
"2 53 244 285 C, Webmaschine, SG 220 EMS \n",
|
||
"3 58 257 107, Webmaschine, OM 220 EOS \n",
|
||
"4 81 138 00138, Schärmaschine 9, \n",
|
||
"\n",
|
||
" ObjektArtID ObjektArtText VorgangsTypID VorgangsTypName \\\n",
|
||
"0 3 Luft-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"1 3 Luft-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"2 5 Greifer-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"3 3 Luft-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"4 16 Schärmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"\n",
|
||
" VorgangsDatum VorgangsStatusId VorgangsPrioritaet \\\n",
|
||
"0 2019-03-06 4 0 \n",
|
||
"1 2019-03-11 5 0 \n",
|
||
"2 2019-03-19 5 0 \n",
|
||
"3 2019-03-21 5 0 \n",
|
||
"4 2019-03-25 5 0 \n",
|
||
"\n",
|
||
" VorgangsBeschreibung VorgangsOrt \\\n",
|
||
"0 NaN NaN \n",
|
||
"1 NaN NaN \n",
|
||
"2 Kupplung schleift NaN \n",
|
||
"3 Gegengewicht wieder anbringen NaN \n",
|
||
"4 da ist etwas gebrochen. (Herr Heininger) NaN \n",
|
||
"\n",
|
||
" VorgangsArtText ErledigungsDatum \\\n",
|
||
"0 Kettbaum kaputt 2019-03-06 \n",
|
||
"1 asgasdg 2019-03-11 \n",
|
||
"2 Kupplung defekt 2019-03-20 \n",
|
||
"3 Gegengewicht an der Webmaschine abgefallen 2019-03-21 \n",
|
||
"4 zentrale Bremsenverstellung linke Gatterseite ... 2019-03-25 \n",
|
||
"\n",
|
||
" ErledigungsArtText ErledigungsBeschreibung \\\n",
|
||
"0 NaN NaN \n",
|
||
"1 NaN NaN \n",
|
||
"2 Reparatur UTT NaN \n",
|
||
"3 Reparatur UTT Schraube ausgebohrt\\nGegengewicht wieder angeb... \n",
|
||
"4 Reparatur UTT Bolzen gebrochen. Bolzen neu angefertig und di... \n",
|
||
"\n",
|
||
" MPMelderArbeitsplatz MPAbteilungBezeichnung Arbeitsbeginn ErstellungsDatum \n",
|
||
"0 Weberei Weberei NaT 2019-03-06 \n",
|
||
"1 Elektrowerkstatt Elektrowerkstatt NaT 2019-03-11 \n",
|
||
"2 Weberei Weberei NaT 2019-03-19 \n",
|
||
"3 Weberei Weberei 2019-03-21 2019-03-21 \n",
|
||
"4 Vorwerk Vorwerk 2019-03-25 2019-03-25 "
|
||
]
|
||
},
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"raw.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
"Number of features: 20\n"
]
}
],
"source": [
"print(f\"Number of features: {len(raw.columns)}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"**New features compared with the previous analysis:**\n",
|
||
"- ``ObjektArtID``\n",
|
||
"- ``ObjektArtText``\n",
|
||
"- ``VorgangsTypName``"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"### Duplicates"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"duplicates_filt = raw.duplicated()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
"Number of duplicates: 84\n"
]
}
],
"source": [
"print(f\"Number of duplicates: {duplicates_filt.sum()}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"filt_data = raw[duplicates_filt]\n",
|
||
"uni_obj_id_dupl = filt_data['ObjektID'].unique()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
"Number of unique object IDs among duplicates: 47\n"
]
}
],
"source": [
"print(f\"Number of unique object IDs among duplicates: {len(uni_obj_id_dupl)}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"RangeIndex: 128936 entries, 0 to 128935\n",
|
||
"Data columns (total 20 columns):\n",
|
||
" # Column Non-Null Count Dtype \n",
|
||
"--- ------ -------------- ----- \n",
|
||
" 0 VorgangsID 128936 non-null int64 \n",
|
||
" 1 ObjektID 128936 non-null int64 \n",
|
||
" 2 HObjektText 128920 non-null object \n",
|
||
" 3 ObjektArtID 128936 non-null int64 \n",
|
||
" 4 ObjektArtText 128289 non-null object \n",
|
||
" 5 VorgangsTypID 128936 non-null int64 \n",
|
||
" 6 VorgangsTypName 128936 non-null object \n",
|
||
" 7 VorgangsDatum 128936 non-null datetime64[ns]\n",
|
||
" 8 VorgangsStatusId 128936 non-null int64 \n",
|
||
" 9 VorgangsPrioritaet 128936 non-null int64 \n",
|
||
" 10 VorgangsBeschreibung 124008 non-null object \n",
|
||
" 11 VorgangsOrt 507 non-null object \n",
|
||
" 12 VorgangsArtText 128936 non-null object \n",
|
||
" 13 ErledigungsDatum 128936 non-null datetime64[ns]\n",
|
||
" 14 ErledigungsArtText 128402 non-null object \n",
|
||
" 15 ErledigungsBeschreibung 118086 non-null object \n",
|
||
" 16 MPMelderArbeitsplatz 6337 non-null object \n",
|
||
" 17 MPAbteilungBezeichnung 6337 non-null object \n",
|
||
" 18 Arbeitsbeginn 123480 non-null datetime64[ns]\n",
|
||
" 19 ErstellungsDatum 128936 non-null datetime64[ns]\n",
|
||
"dtypes: datetime64[ns](4), int64(6), object(10)\n",
|
||
"memory usage: 19.7+ MB\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"wo_duplicates = raw.drop_duplicates(ignore_index=True)\n",
|
||
"wo_duplicates.info()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"### ``VorgangsBeschreibung``"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"#### **NA values and duplicates**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"String cleaning"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"SPECIAL_CHARS = set(['&', '$', '%', '§', '/', '(', ')', '_', \n",
|
||
" '+', '–', '--', '<', '>', '´',\n",
|
||
"])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"def clean_string(string: str) -> str:\n",
"    #num_reps = 5\n",
"\n",
"    # replace tabs, newlines, and other whitespace control characters with spaces\n",
"    pattern = r'[\\t\\n\\r\\f\\v]'\n",
"    string = re.sub(pattern, ' ', string)\n",
"    # remove dates\n",
"    pattern = r'[\\d]{1,4}[.:][\\d]{1,4}[.:][\\d]{1,4}'\n",
"    string = re.sub(pattern, '', string)\n",
"    # remove times\n",
"    pattern = r'[\\d]{1,2}[:][\\d]{1,2}[:][\\d]{0,2}'\n",
"    string = re.sub(pattern, '', string)\n",
"    # remove all chars except punctuation and alphanumeric ones\n",
"    pattern = r'[^ \\w.,;:\\-äöüÄÖÜ]+'\n",
"    string = re.sub(pattern, '', string)\n",
"    # remove - where it is used as a dash between words\n",
"    pattern = r'[\\W]+-[\\W]+'\n",
"    string = re.sub(pattern, ' ', string)\n",
"    # remove whitespaces in front of punctuation\n",
"    pattern = r'[ ]+([;,.:])'\n",
"    string = re.sub(pattern, r'\\1', string)\n",
"    # collapse multiple whitespaces into one\n",
"    pattern = r'[ ]+'\n",
"    string = re.sub(pattern, ' ', string)\n",
"    # remove whitespaces at the beginning and the end\n",
"    string = string.strip()\n",
"\n",
"    #while num_reps != 0:\n",
"        #string = string.replace('\\n', ' ')\n",
"        #string = string.replace('\\t', ' ')\n",
"        #string = string.replace(' ', ' ')\n",
"        #string = string.replace(' ', ' ')\n",
"        #string = string.replace(' - ', ' ')\n",
"    \"\"\"\n",
"    for char in SPECIAL_CHARS:\n",
"        string = string.replace(char, '')\n",
"\n",
"    #num_reps -= 1\n",
"\n",
"    # remove spaces at the beginning and the end\n",
"    string = string.strip()\n",
"    \"\"\"\n",
"\n",
"    return string"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"base = wo_duplicates.copy()\n",
|
||
"base = base.dropna(axis=0, subset='VorgangsBeschreibung')\n",
|
||
"base['VorgangsBeschreibung'] = base['VorgangsBeschreibung'].map(clean_string)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>VorgangsID</th>\n",
|
||
" <th>ObjektID</th>\n",
|
||
" <th>HObjektText</th>\n",
|
||
" <th>ObjektArtID</th>\n",
|
||
" <th>ObjektArtText</th>\n",
|
||
" <th>VorgangsTypID</th>\n",
|
||
" <th>VorgangsTypName</th>\n",
|
||
" <th>VorgangsDatum</th>\n",
|
||
" <th>VorgangsStatusId</th>\n",
|
||
" <th>VorgangsPrioritaet</th>\n",
|
||
" <th>VorgangsBeschreibung</th>\n",
|
||
" <th>VorgangsOrt</th>\n",
|
||
" <th>VorgangsArtText</th>\n",
|
||
" <th>ErledigungsDatum</th>\n",
|
||
" <th>ErledigungsArtText</th>\n",
|
||
" <th>ErledigungsBeschreibung</th>\n",
|
||
" <th>MPMelderArbeitsplatz</th>\n",
|
||
" <th>MPAbteilungBezeichnung</th>\n",
|
||
" <th>Arbeitsbeginn</th>\n",
|
||
" <th>ErstellungsDatum</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>53</td>\n",
|
||
" <td>244</td>\n",
|
||
" <td>285 C, Webmaschine, SG 220 EMS</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>Greifer-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-19</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Kupplung schleift</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Kupplung defekt</td>\n",
|
||
" <td>2019-03-20</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>2019-03-19</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>58</td>\n",
|
||
" <td>257</td>\n",
|
||
" <td>107, Webmaschine, OM 220 EOS</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Gegengewicht wieder anbringen</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Gegengewicht an der Webmaschine abgefallen</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Schraube ausgebohrt\\nGegengewicht wieder angeb...</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>81</td>\n",
|
||
" <td>138</td>\n",
|
||
" <td>00138, Schärmaschine 9,</td>\n",
|
||
" <td>16</td>\n",
|
||
" <td>Schärmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>da ist etwas gebrochen. Herr Heininger</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>zentrale Bremsenverstellung linke Gatterseite ...</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Bolzen gebrochen. Bolzen neu angefertig und di...</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>82</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Warenschau allgemein</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Klappbügel Portalkran H31 defekt</td>\n",
|
||
" <td>Warenschau allgemein</td>\n",
|
||
" <td>Allgemeine Reparaturarbeiten</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Feder ausgetauscht</td>\n",
|
||
" <td>Warenschau</td>\n",
|
||
" <td>Warenschau</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>76</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Neben der Türe</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-22</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Schraube nix mer gut</td>\n",
|
||
" <td>Neben der Türe</td>\n",
|
||
" <td>Kettbaum</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Schrauben ausgebohrt\\t\\nGewinde nachgeschnitten\\t</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>2019-03-22</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128931</th>\n",
|
||
" <td>518956</td>\n",
|
||
" <td>1708</td>\n",
|
||
" <td>01708, Betriebsfahrräder Schlosserei,</td>\n",
|
||
" <td>57</td>\n",
|
||
" <td>Interne Wartungsobjekte</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2023-06-19</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>2-wöchige Reinigung Sichtkontrolle Technische ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>02 Interne Reinigung / Pflege / Überprüfung</td>\n",
|
||
" <td>2023-06-19</td>\n",
|
||
" <td>Intern UTT - Prüfung</td>\n",
|
||
" <td>Reinigung & Sichtkontrolle (Technische Einric...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2023-06-19</td>\n",
|
||
" <td>2023-03-14</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128932</th>\n",
|
||
" <td>275123</td>\n",
|
||
" <td>1654</td>\n",
|
||
" <td>WEBEREI ALLGEMEIN, Weberei allgemein,</td>\n",
|
||
" <td>90</td>\n",
|
||
" <td>UTT allgemein</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2022-09-29</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Adapter entfernen und Gewinde nachschneiden.</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Kettbaum-Adapter</td>\n",
|
||
" <td>2022-09-30</td>\n",
|
||
" <td>Intern UTT - Reparatur</td>\n",
|
||
" <td>mit schlosserei aufräumen</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>2022-09-30</td>\n",
|
||
" <td>2022-09-29</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128933</th>\n",
|
||
" <td>275125</td>\n",
|
||
" <td>1795</td>\n",
|
||
" <td>A054.S, Jacquardmaschine,</td>\n",
|
||
" <td>24</td>\n",
|
||
" <td>Stäubli-Jacquardmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2022-09-30</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Alle 4 Schrauben und teile der Kettbaumlagerun...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Kettbaum</td>\n",
|
||
" <td>2022-09-30</td>\n",
|
||
" <td>Intern UTT - Reparatur</td>\n",
|
||
" <td>Neues Teil eingebaut und altes repariert</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>2022-09-30</td>\n",
|
||
" <td>2022-09-30</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128934</th>\n",
|
||
" <td>275188</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>00001, Ausrüstungsanlage 1,</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Waschmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2022-09-30</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Walzenlager WK 6 überprüfenauswechseln</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Lagereinheit (Wälzlager, Kugellager, etc.)</td>\n",
|
||
" <td>2022-10-04</td>\n",
|
||
" <td>Intern UTT - Reparatur</td>\n",
|
||
" <td>Lager getauscht</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>2022-10-04</td>\n",
|
||
" <td>2022-09-30</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128935</th>\n",
|
||
" <td>275219</td>\n",
|
||
" <td>326</td>\n",
|
||
" <td>B38, Niederhubwagen,</td>\n",
|
||
" <td>32</td>\n",
|
||
" <td>Flurförderzeuge / Putzmaschine / Rasenmäher</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2022-10-03</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Befestigung Deckel für Batteriefach defekt Hal...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Flurförderzeug</td>\n",
|
||
" <td>2022-10-05</td>\n",
|
||
" <td>Intern UTT - Reparatur</td>\n",
|
||
" <td>Neue Gasfeder eingebaut</td>\n",
|
||
" <td>Warenschau</td>\n",
|
||
" <td>Warenschau</td>\n",
|
||
" <td>2022-10-04</td>\n",
|
||
" <td>2022-10-03</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>124008 rows × 20 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" VorgangsID ObjektID HObjektText \\\n",
|
||
"2 53 244 285 C, Webmaschine, SG 220 EMS \n",
|
||
"3 58 257 107, Webmaschine, OM 220 EOS \n",
|
||
"4 81 138 00138, Schärmaschine 9, \n",
|
||
"5 82 0 Warenschau allgemein \n",
|
||
"6 76 0 Neben der Türe \n",
|
||
"... ... ... ... \n",
|
||
"128931 518956 1708 01708, Betriebsfahrräder Schlosserei, \n",
|
||
"128932 275123 1654 WEBEREI ALLGEMEIN, Weberei allgemein, \n",
|
||
"128933 275125 1795 A054.S, Jacquardmaschine, \n",
|
||
"128934 275188 1 00001, Ausrüstungsanlage 1, \n",
|
||
"128935 275219 326 B38, Niederhubwagen, \n",
|
||
"\n",
|
||
" ObjektArtID ObjektArtText \\\n",
|
||
"2 5 Greifer-Webmaschine \n",
|
||
"3 3 Luft-Webmaschine \n",
|
||
"4 16 Schärmaschine \n",
|
||
"5 0 NaN \n",
|
||
"6 0 NaN \n",
|
||
"... ... ... \n",
|
||
"128931 57 Interne Wartungsobjekte \n",
|
||
"128932 90 UTT allgemein \n",
|
||
"128933 24 Stäubli-Jacquardmaschine \n",
|
||
"128934 1 Waschmaschine \n",
|
||
"128935 32 Flurförderzeuge / Putzmaschine / Rasenmäher \n",
|
||
"\n",
|
||
" VorgangsTypID VorgangsTypName VorgangsDatum \\\n",
|
||
"2 3 Reparaturauftrag (Portal) 2019-03-19 \n",
|
||
"3 3 Reparaturauftrag (Portal) 2019-03-21 \n",
|
||
"4 3 Reparaturauftrag (Portal) 2019-03-25 \n",
|
||
"5 3 Reparaturauftrag (Portal) 2019-03-25 \n",
|
||
"6 3 Reparaturauftrag (Portal) 2019-03-22 \n",
|
||
"... ... ... ... \n",
|
||
"128931 1 Wartung 2023-06-19 \n",
|
||
"128932 3 Reparaturauftrag (Portal) 2022-09-29 \n",
|
||
"128933 3 Reparaturauftrag (Portal) 2022-09-30 \n",
|
||
"128934 3 Reparaturauftrag (Portal) 2022-09-30 \n",
|
||
"128935 3 Reparaturauftrag (Portal) 2022-10-03 \n",
|
||
"\n",
|
||
" VorgangsStatusId VorgangsPrioritaet \\\n",
|
||
"2 5 0 \n",
|
||
"3 5 0 \n",
|
||
"4 5 0 \n",
|
||
"5 5 0 \n",
|
||
"6 5 0 \n",
|
||
"... ... ... \n",
|
||
"128931 5 0 \n",
|
||
"128932 5 0 \n",
|
||
"128933 5 0 \n",
|
||
"128934 5 1 \n",
|
||
"128935 5 0 \n",
|
||
"\n",
|
||
" VorgangsBeschreibung \\\n",
|
||
"2 Kupplung schleift \n",
|
||
"3 Gegengewicht wieder anbringen \n",
|
||
"4 da ist etwas gebrochen. Herr Heininger \n",
|
||
"5 Klappbügel Portalkran H31 defekt \n",
|
||
"6 Schraube nix mer gut \n",
|
||
"... ... \n",
|
||
"128931 2-wöchige Reinigung Sichtkontrolle Technische ... \n",
|
||
"128932 Adapter entfernen und Gewinde nachschneiden. \n",
|
||
"128933 Alle 4 Schrauben und teile der Kettbaumlagerun... \n",
|
||
"128934 Walzenlager WK 6 überprüfenauswechseln \n",
|
||
"128935 Befestigung Deckel für Batteriefach defekt Hal... \n",
|
||
"\n",
|
||
" VorgangsOrt \\\n",
|
||
"2 NaN \n",
|
||
"3 NaN \n",
|
||
"4 NaN \n",
|
||
"5 Warenschau allgemein \n",
|
||
"6 Neben der Türe \n",
|
||
"... ... \n",
|
||
"128931 NaN \n",
|
||
"128932 NaN \n",
|
||
"128933 NaN \n",
|
||
"128934 NaN \n",
|
||
"128935 NaN \n",
|
||
"\n",
|
||
" VorgangsArtText ErledigungsDatum \\\n",
|
||
"2 Kupplung defekt 2019-03-20 \n",
|
||
"3 Gegengewicht an der Webmaschine abgefallen 2019-03-21 \n",
|
||
"4 zentrale Bremsenverstellung linke Gatterseite ... 2019-03-25 \n",
|
||
"5 Allgemeine Reparaturarbeiten 2019-03-25 \n",
|
||
"6 Kettbaum 2019-03-25 \n",
|
||
"... ... ... \n",
|
||
"128931 02 Interne Reinigung / Pflege / Überprüfung 2023-06-19 \n",
|
||
"128932 Kettbaum-Adapter 2022-09-30 \n",
|
||
"128933 Kettbaum 2022-09-30 \n",
|
||
"128934 Lagereinheit (Wälzlager, Kugellager, etc.) 2022-10-04 \n",
|
||
"128935 Flurförderzeug 2022-10-05 \n",
|
||
"\n",
|
||
" ErledigungsArtText \\\n",
|
||
"2 Reparatur UTT \n",
|
||
"3 Reparatur UTT \n",
|
||
"4 Reparatur UTT \n",
|
||
"5 Reparatur UTT \n",
|
||
"6 Reparatur UTT \n",
|
||
"... ... \n",
|
||
"128931 Intern UTT - Prüfung \n",
|
||
"128932 Intern UTT - Reparatur \n",
|
||
"128933 Intern UTT - Reparatur \n",
|
||
"128934 Intern UTT - Reparatur \n",
|
||
"128935 Intern UTT - Reparatur \n",
|
||
"\n",
|
||
" ErledigungsBeschreibung \\\n",
|
||
"2 NaN \n",
|
||
"3 Schraube ausgebohrt\\nGegengewicht wieder angeb... \n",
|
||
"4 Bolzen gebrochen. Bolzen neu angefertig und di... \n",
|
||
"5 Feder ausgetauscht \n",
|
||
"6 Schrauben ausgebohrt\\t\\nGewinde nachgeschnitten\\t \n",
|
||
"... ... \n",
|
||
"128931 Reinigung & Sichtkontrolle (Technische Einric... \n",
|
||
"128932 mit schlosserei aufräumen \n",
|
||
"128933 Neues Teil eingebaut und altes repariert \n",
|
||
"128934 Lager getauscht \n",
|
||
"128935 Neue Gasfeder eingebaut \n",
|
||
"\n",
|
||
" MPMelderArbeitsplatz MPAbteilungBezeichnung Arbeitsbeginn \\\n",
|
||
"2 Weberei Weberei NaT \n",
|
||
"3 Weberei Weberei 2019-03-21 \n",
|
||
"4 Vorwerk Vorwerk 2019-03-25 \n",
|
||
"5 Warenschau Warenschau 2019-03-25 \n",
|
||
"6 Vorwerk Vorwerk 2019-03-25 \n",
|
||
"... ... ... ... \n",
|
||
"128931 NaN NaN 2023-06-19 \n",
|
||
"128932 Weberei Weberei 2022-09-30 \n",
|
||
"128933 Weberei Weberei 2022-09-30 \n",
|
||
"128934 Ausrüstung Ausrüstung 2022-10-04 \n",
|
||
"128935 Warenschau Warenschau 2022-10-04 \n",
|
||
"\n",
|
||
" ErstellungsDatum \n",
|
||
"2 2019-03-19 \n",
|
||
"3 2019-03-21 \n",
|
||
"4 2019-03-25 \n",
|
||
"5 2019-03-25 \n",
|
||
"6 2019-03-22 \n",
|
||
"... ... \n",
|
||
"128931 2023-03-14 \n",
|
||
"128932 2022-09-29 \n",
|
||
"128933 2022-09-30 \n",
|
||
"128934 2022-09-30 \n",
|
||
"128935 2022-10-03 \n",
|
||
"\n",
|
||
"[124008 rows x 20 columns]"
|
||
]
|
||
},
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"base"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 17,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
     "Entries: 124008\n"
    ]
   }
  ],
  "source": [
   "descriptions = base['VorgangsBeschreibung']\n",
   "print(f\"Entries: {len(descriptions)}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 18,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
     "Number of duplicate task descriptions: 117255\n",
     "Number of unique task descriptions: 6753\n",
     "Share of unique task descriptions: 5.45 %\n"
    ]
   }
  ],
  "source": [
   "num_dupl_descr = descriptions.duplicated().sum()\n",
   "uni_descr = descriptions.unique()\n",
   "num_uni_descr = len(uni_descr)\n",
   "\n",
   "print(f\"Number of duplicate task descriptions: {num_dupl_descr}\")\n",
   "print(f\"Number of unique task descriptions: {num_uni_descr}\")\n",
   "print(f\"Share of unique task descriptions: {num_uni_descr / len(descriptions) * 100:.2f} %\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
   "if not LOAD_CALC_FILES:\n",
   "    # one summary row per unique description\n",
   "    cols = ['descr', 'len', 'num_occur', 'assoc_obj_ids', 'num_assoc_obj_ids']\n",
   "    descr_df = pd.DataFrame(columns=cols)\n",
   "    # track the most frequent description while iterating\n",
   "    max_val = 0\n",
   "    text = None\n",
   "    index = 0\n",
   "\n",
   "    for idx, description in enumerate(uni_descr):\n",
   "        len_descr = len(description)\n",
   "        # all rows that carry exactly this description\n",
   "        filt = base['VorgangsBeschreibung'] == description\n",
   "        temp = base[filt]\n",
   "        # sorted unique object IDs associated with this description\n",
   "        assoc_obj_ids = temp['ObjektID'].unique()\n",
   "        assoc_obj_ids = np.sort(assoc_obj_ids, kind='stable')\n",
   "        num_assoc_obj_ids = len(assoc_obj_ids)\n",
   "        num_dupl = filt.sum()\n",
   "\n",
   "        conc_df = pd.DataFrame(data=[[\n",
   "            description,\n",
   "            len_descr,\n",
   "            num_dupl,\n",
   "            assoc_obj_ids,\n",
   "            num_assoc_obj_ids\n",
   "        ]], columns=cols)\n",
   "\n",
   "        descr_df = pd.concat([descr_df, conc_df], ignore_index=True)\n",
   "\n",
   "        if num_dupl > max_val:\n",
   "            max_val = num_dupl\n",
   "            index = idx\n",
   "            text = description\n",
   "\n",
   "    # most frequent descriptions first\n",
   "    temp1 = descr_df.sort_values(by='num_occur', ascending=False)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>descr</th>\n",
|
||
" <th>len</th>\n",
|
||
" <th>num_occur</th>\n",
|
||
" <th>assoc_obj_ids</th>\n",
|
||
" <th>num_assoc_obj_ids</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>161</th>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>66</td>\n",
|
||
" <td>92592</td>\n",
|
||
" <td>[0, 17, 41, 42, 43, 44, 45, 46, 47, 51, 52, 53...</td>\n",
|
||
" <td>206</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>33</th>\n",
|
||
" <td>Wöchentliche Sichtkontrolle Reinigung</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>1654</td>\n",
|
||
" <td>[301, 304, 305, 313, 314, 331, 332, 510, 511, ...</td>\n",
|
||
" <td>18</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>130</th>\n",
|
||
" <td>Tägliche Überprüfung der Ölabscheider</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>1616</td>\n",
|
||
" <td>[0, 970, 2134, 2137]</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>159</th>\n",
|
||
" <td>Wöchentliche Kontrolle der WC-Anlagen</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>1265</td>\n",
|
||
" <td>[1352, 1353, 1354, 1684, 1685, 1686, 1687, 168...</td>\n",
|
||
" <td>11</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>139</th>\n",
|
||
" <td>Halbjährliche Kontrolle des Stabbreithalters</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>687</td>\n",
|
||
" <td>[51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 6...</td>\n",
|
||
" <td>166</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2665</th>\n",
|
||
" <td>Überprüfung der Y-Achse Schneidbrücke am LC 2 ...</td>\n",
|
||
" <td>176</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[20]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2664</th>\n",
|
||
" <td>Luftschlauch muss ausgetauscht werden. Ist und...</td>\n",
|
||
" <td>195</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[1]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2663</th>\n",
|
||
" <td>Riemenscheibe tauschen auf 650 UPM</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[74]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2660</th>\n",
|
||
" <td>Durchführung: Sollwert: 20 0,1g</td>\n",
|
||
" <td>31</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[1746]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6752</th>\n",
|
||
" <td>Befestigung Deckel für Batteriefach defekt Hal...</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[326]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6753 rows × 5 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" descr len num_occur \\\n",
|
||
"161 Tägliche Wartungstätigkeiten nach Vorgabe des ... 66 92592 \n",
|
||
"33 Wöchentliche Sichtkontrolle Reinigung 37 1654 \n",
|
||
"130 Tägliche Überprüfung der Ölabscheider 37 1616 \n",
|
||
"159 Wöchentliche Kontrolle der WC-Anlagen 37 1265 \n",
|
||
"139 Halbjährliche Kontrolle des Stabbreithalters 44 687 \n",
|
||
"... ... ... ... \n",
|
||
"2665 Überprüfung der Y-Achse Schneidbrücke am LC 2 ... 176 1 \n",
|
||
"2664 Luftschlauch muss ausgetauscht werden. Ist und... 195 1 \n",
|
||
"2663 Riemenscheibe tauschen auf 650 UPM 34 1 \n",
|
||
"2660 Durchführung: Sollwert: 20 0,1g 31 1 \n",
|
||
"6752 Befestigung Deckel für Batteriefach defekt Hal... 99 1 \n",
|
||
"\n",
|
||
" assoc_obj_ids num_assoc_obj_ids \n",
|
||
"161 [0, 17, 41, 42, 43, 44, 45, 46, 47, 51, 52, 53... 206 \n",
|
||
"33 [301, 304, 305, 313, 314, 331, 332, 510, 511, ... 18 \n",
|
||
"130 [0, 970, 2134, 2137] 4 \n",
|
||
"159 [1352, 1353, 1354, 1684, 1685, 1686, 1687, 168... 11 \n",
|
||
"139 [51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 6... 166 \n",
|
||
"... ... ... \n",
|
||
"2665 [20] 1 \n",
|
||
"2664 [1] 1 \n",
|
||
"2663 [74] 1 \n",
|
||
"2660 [1746] 1 \n",
|
||
"6752 [326] 1 \n",
|
||
"\n",
|
||
"[6753 rows x 5 columns]"
|
||
]
|
||
},
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
   "# save/load dataframe\n",
   "FILE_PATH = 'VorgangsBeschreibung_analyse_1.fth'\n",
   "if LOAD_CALC_FILES:\n",
   "    # restore the original index that was stored as a regular column\n",
   "    temp1 = pd.read_feather(FILE_PATH)\n",
   "    temp1 = temp1.set_index('index')\n",
   "else:\n",
   "    # Feather does not keep a non-default index, so move it into a column first\n",
   "    save_df = temp1.reset_index()\n",
   "    save_df.to_feather(FILE_PATH)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"filt = temp1['descr'].str.contains('3-monatlich')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>descr</th>\n",
|
||
" <th>len</th>\n",
|
||
" <th>num_occur</th>\n",
|
||
" <th>assoc_obj_ids</th>\n",
|
||
" <th>num_assoc_obj_ids</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>476</th>\n",
|
||
" <td>3-monatliche Sichtkontrolle Reinigung</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>222</td>\n",
|
||
" <td>[883, 1196, 1197, 1198, 1199, 1201, 1202, 1203...</td>\n",
|
||
" <td>18</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>671</th>\n",
|
||
" <td>3-monatliche Kontrolle</td>\n",
|
||
" <td>22</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>[2021, 2045]</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>303</th>\n",
|
||
" <td>3-monatliche Überprüfung durch Firma Siemens</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>16</td>\n",
|
||
" <td>[2029]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1055</th>\n",
|
||
" <td>3-monatliche Kontrolle der Wasserfilter, bei B...</td>\n",
|
||
" <td>186</td>\n",
|
||
" <td>16</td>\n",
|
||
" <td>[1175, 1176]</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>914</th>\n",
|
||
" <td>3-monatliche Überprüfung der Telefonanlage</td>\n",
|
||
" <td>42</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>[2035]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>281</th>\n",
|
||
" <td>3-monatliche Überprüfung der Torsprechanlage</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>[2037]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>280</th>\n",
|
||
" <td>3-monatliche Überprüfung der Sicherheitslichts...</td>\n",
|
||
" <td>111</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>[2046]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>658</th>\n",
|
||
" <td>3-monatliche Sichtkontrolle der Not- Sicherhei...</td>\n",
|
||
" <td>76</td>\n",
|
||
" <td>13</td>\n",
|
||
" <td>[2042]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>279</th>\n",
|
||
" <td>3-monatliche Überprüfung der Regalsicherungsan...</td>\n",
|
||
" <td>84</td>\n",
|
||
" <td>13</td>\n",
|
||
" <td>[2047]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3234</th>\n",
|
||
" <td>3-monatliche Überprüfung der Personen-Überwach...</td>\n",
|
||
" <td>72</td>\n",
|
||
" <td>11</td>\n",
|
||
" <td>[2040]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>608</th>\n",
|
||
" <td>3-monatliche Kontrolle der optischen Alarmgebe...</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>[2041]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>137</th>\n",
|
||
" <td>3-monatliche Überprüfung des Abwassers durch P...</td>\n",
|
||
" <td>70</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>[958]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2704</th>\n",
|
||
" <td>3-monatliche Reinigung</td>\n",
|
||
" <td>22</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>[903, 905]</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>995</th>\n",
|
||
" <td>3-monatliche Kontrolle der Erste-Hilfe-Kästen ...</td>\n",
|
||
" <td>103</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>[2456]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3207</th>\n",
|
||
" <td>3-monatliche Sichtkontrolle der Mittelspannung...</td>\n",
|
||
" <td>89</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>[2026]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3551</th>\n",
|
||
" <td>3-monatliche Überprüfung der Uhrenanlagen Betr...</td>\n",
|
||
" <td>84</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>[2034]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2939</th>\n",
|
||
" <td>3-monatliche Sichtkontrolle der Starkstrom-Anl...</td>\n",
|
||
" <td>75</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>[2021]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1001</th>\n",
|
||
" <td>3-monatliche Kontrolle des Seils eventueller A...</td>\n",
|
||
" <td>54</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>[838]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4054</th>\n",
|
||
" <td>3-monatliche Überprüfung des Abwassers durch P...</td>\n",
|
||
" <td>103</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>[958]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3587</th>\n",
|
||
" <td>3-monatliche Sichtkontrolle der optischen Alar...</td>\n",
|
||
" <td>66</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>[2041]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4284</th>\n",
|
||
" <td>3-monatliche Überprüfung des Abwassers durch P...</td>\n",
|
||
" <td>142</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[958]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6157</th>\n",
|
||
" <td>3-monatliche Überprüfung des Abwassers durch P...</td>\n",
|
||
" <td>125</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[958]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5333</th>\n",
|
||
" <td>3-monatliche Kontrolle des Seils eventueller A...</td>\n",
|
||
" <td>995</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[838]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5743</th>\n",
|
||
" <td>3-monatliche Überprüfung durch Firma Siemens. ...</td>\n",
|
||
" <td>110</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[2029]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" descr len num_occur \\\n",
|
||
"476 3-monatliche Sichtkontrolle Reinigung 37 222 \n",
|
||
"671 3-monatliche Kontrolle 22 20 \n",
|
||
"303 3-monatliche Überprüfung durch Firma Siemens 44 16 \n",
|
||
"1055 3-monatliche Kontrolle der Wasserfilter, bei B... 186 16 \n",
|
||
"914 3-monatliche Überprüfung der Telefonanlage 42 14 \n",
|
||
"281 3-monatliche Überprüfung der Torsprechanlage 44 14 \n",
|
||
"280 3-monatliche Überprüfung der Sicherheitslichts... 111 14 \n",
|
||
"658 3-monatliche Sichtkontrolle der Not- Sicherhei... 76 13 \n",
|
||
"279 3-monatliche Überprüfung der Regalsicherungsan... 84 13 \n",
|
||
"3234 3-monatliche Überprüfung der Personen-Überwach... 72 11 \n",
|
||
"608 3-monatliche Kontrolle der optischen Alarmgebe... 61 10 \n",
|
||
"137 3-monatliche Überprüfung des Abwassers durch P... 70 8 \n",
|
||
"2704 3-monatliche Reinigung 22 8 \n",
|
||
"995 3-monatliche Kontrolle der Erste-Hilfe-Kästen ... 103 7 \n",
|
||
"3207 3-monatliche Sichtkontrolle der Mittelspannung... 89 5 \n",
|
||
"3551 3-monatliche Überprüfung der Uhrenanlagen Betr... 84 4 \n",
|
||
"2939 3-monatliche Sichtkontrolle der Starkstrom-Anl... 75 4 \n",
|
||
"1001 3-monatliche Kontrolle des Seils eventueller A... 54 2 \n",
|
||
"4054 3-monatliche Überprüfung des Abwassers durch P... 103 2 \n",
|
||
"3587 3-monatliche Sichtkontrolle der optischen Alar... 66 2 \n",
|
||
"4284 3-monatliche Überprüfung des Abwassers durch P... 142 1 \n",
|
||
"6157 3-monatliche Überprüfung des Abwassers durch P... 125 1 \n",
|
||
"5333 3-monatliche Kontrolle des Seils eventueller A... 995 1 \n",
|
||
"5743 3-monatliche Überprüfung durch Firma Siemens. ... 110 1 \n",
|
||
"\n",
|
||
" assoc_obj_ids num_assoc_obj_ids \n",
|
||
"476 [883, 1196, 1197, 1198, 1199, 1201, 1202, 1203... 18 \n",
|
||
"671 [2021, 2045] 2 \n",
|
||
"303 [2029] 1 \n",
|
||
"1055 [1175, 1176] 2 \n",
|
||
"914 [2035] 1 \n",
|
||
"281 [2037] 1 \n",
|
||
"280 [2046] 1 \n",
|
||
"658 [2042] 1 \n",
|
||
"279 [2047] 1 \n",
|
||
"3234 [2040] 1 \n",
|
||
"608 [2041] 1 \n",
|
||
"137 [958] 1 \n",
|
||
"2704 [903, 905] 2 \n",
|
||
"995 [2456] 1 \n",
|
||
"3207 [2026] 1 \n",
|
||
"3551 [2034] 1 \n",
|
||
"2939 [2021] 1 \n",
|
||
"1001 [838] 1 \n",
|
||
"4054 [958] 1 \n",
|
||
"3587 [2041] 1 \n",
|
||
"4284 [958] 1 \n",
|
||
"6157 [958] 1 \n",
|
||
"5333 [838] 1 \n",
|
||
"5743 [2029] 1 "
|
||
]
|
||
},
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"test2 = temp1.loc[filt,:]\n",
|
||
"test2"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
   "def pre_clean_spell_check(string: str) -> str:\n",
   "    # replace every character listed in SPELL_CHECK_NON_CHARS with a space\n",
   "    for char in SPELL_CHECK_NON_CHARS:\n",
   "        string = string.replace(char, ' ')\n",
   "\n",
   "    # remove spaces at the beginning and the end\n",
   "    string = string.strip()\n",
   "\n",
   "    return string\n",
   "\n",
   "\n",
   "test = temp1['descr'].map(pre_clean_spell_check)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"objs = temp1.loc[140, 'assoc_obj_ids']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([7], dtype=int64)"
|
||
]
|
||
},
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"objs"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>VorgangsID</th>\n",
|
||
" <th>ObjektID</th>\n",
|
||
" <th>HObjektText</th>\n",
|
||
" <th>ObjektArtID</th>\n",
|
||
" <th>ObjektArtText</th>\n",
|
||
" <th>VorgangsTypID</th>\n",
|
||
" <th>VorgangsTypName</th>\n",
|
||
" <th>VorgangsDatum</th>\n",
|
||
" <th>VorgangsStatusId</th>\n",
|
||
" <th>VorgangsPrioritaet</th>\n",
|
||
" <th>VorgangsBeschreibung</th>\n",
|
||
" <th>VorgangsOrt</th>\n",
|
||
" <th>VorgangsArtText</th>\n",
|
||
" <th>ErledigungsDatum</th>\n",
|
||
" <th>ErledigungsArtText</th>\n",
|
||
" <th>ErledigungsBeschreibung</th>\n",
|
||
" <th>MPMelderArbeitsplatz</th>\n",
|
||
" <th>MPAbteilungBezeichnung</th>\n",
|
||
" <th>Arbeitsbeginn</th>\n",
|
||
" <th>ErstellungsDatum</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>166</th>\n",
|
||
" <td>111649</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2021-03-17</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>-Bereich Aufwicklung, Bogenwalze Madenschraube...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Walze (mechanischer Defekt)</td>\n",
|
||
" <td>2021-03-17</td>\n",
|
||
" <td>Intern UTT - Reparatur</td>\n",
|
||
" <td>Madenschrauben angezogen</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>2021-03-17</td>\n",
|
||
" <td>2021-03-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>222</th>\n",
|
||
" <td>133856</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2021-07-27</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>2 Keilriemen gerissen- Bereich Abluftventilato...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Antriebsriemen (Keilriemen / Zahnriemen / Flac...</td>\n",
|
||
" <td>2021-07-27</td>\n",
|
||
" <td>Intern UTT - Reparatur</td>\n",
|
||
" <td>die Keilriemen SPA 1282 aus Neu Ulm geholt und...</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>2021-07-27</td>\n",
|
||
" <td>2021-07-27</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>240</th>\n",
|
||
" <td>140704</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2021-09-29</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wir benötigen einen weiteren Ersatzteilschrank...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Allgemeine Reparaturarbeiten</td>\n",
|
||
" <td>2021-10-04</td>\n",
|
||
" <td>Intern UTT - Montage</td>\n",
|
||
" <td>Wurde montiert</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>2021-10-04</td>\n",
|
||
" <td>2021-09-29</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>437</th>\n",
|
||
" <td>123811</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2021-05-06</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>bitte dringend 10l Eimer zum Silikon versenden...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Maschineninfrastruktur</td>\n",
|
||
" <td>2021-05-06</td>\n",
|
||
" <td>Intern UTT - Wartung</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>2021-05-06</td>\n",
|
||
" <td>2021-05-06</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>439</th>\n",
|
||
" <td>107885</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2021-07-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Monatliche Kontrolle des Flusen-Absaugrohrs</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Maschinen-Wartung monatlich</td>\n",
|
||
" <td>2021-06-28</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>2021-03-03</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128396</th>\n",
|
||
" <td>531424</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2023-05-08</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>KKT Chiller Auslauf Störung. Füllstand Min. STOP</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Allgemeine Reparaturarbeiten</td>\n",
|
||
" <td>2023-05-08</td>\n",
|
||
" <td>Intern UTT - Reparatur</td>\n",
|
||
" <td>Kühlflüssigkeit aufgefüllt und Filter gewechse...</td>\n",
|
||
" <td>Ausrüstung 2</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>2023-05-08</td>\n",
|
||
" <td>2023-05-08</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128446</th>\n",
|
||
" <td>530613</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2023-05-30</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Monatliche Überprüfung der Gasleitung mit dem ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>01 Interne Reinigung / Pflege / Überprüfung</td>\n",
|
||
" <td>2023-06-05</td>\n",
|
||
" <td>Intern UTT - Prüfung</td>\n",
|
||
" <td>Dichtheitsprüfung der Gasleitungen</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2023-06-05</td>\n",
|
||
" <td>2023-04-24</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128563</th>\n",
|
||
" <td>580234</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2023-05-30</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Mischer für Beschichtungsanlage bitte ausbrenn...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Allgemeine Reparaturarbeiten</td>\n",
|
||
" <td>2023-05-30</td>\n",
|
||
" <td>Intern UTT - Reparatur</td>\n",
|
||
" <td>erledigt</td>\n",
|
||
" <td>Ausrüstung 2, Kombianlage</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>2023-05-30</td>\n",
|
||
" <td>2023-05-30</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128636</th>\n",
|
||
" <td>586208</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2023-06-12</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Haken im Kran Auslauf defekt</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Allgemeine Reparaturarbeiten</td>\n",
|
||
" <td>2023-06-12</td>\n",
|
||
" <td>Intern UTT - Reparatur</td>\n",
|
||
" <td>Haken getauscht</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>Ausrüstung</td>\n",
|
||
" <td>2023-06-12</td>\n",
|
||
" <td>2023-06-12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128915</th>\n",
|
||
" <td>261786</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>00007, Ausrüstung 2,</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>Beschichtungsmaschinen</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2023-05-30</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Kontrolle der Risiko-Ersatzteile</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Überprüfung Risikoersatzteile</td>\n",
|
||
" <td>2023-05-30</td>\n",
|
||
" <td>Intern UTT - Dokumentenkontrolle</td>\n",
|
||
" <td>erledigt.\\n</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2023-05-30</td>\n",
|
||
" <td>2022-06-30</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>272 rows × 20 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" VorgangsID ObjektID HObjektText ObjektArtID \\\n",
|
||
"166 111649 7 00007, Ausrüstung 2, 2 \n",
|
||
"222 133856 7 00007, Ausrüstung 2, 2 \n",
|
||
"240 140704 7 00007, Ausrüstung 2, 2 \n",
|
||
"437 123811 7 00007, Ausrüstung 2, 2 \n",
|
||
"439 107885 7 00007, Ausrüstung 2, 2 \n",
|
||
"... ... ... ... ... \n",
|
||
"128396 531424 7 00007, Ausrüstung 2, 2 \n",
|
||
"128446 530613 7 00007, Ausrüstung 2, 2 \n",
|
||
"128563 580234 7 00007, Ausrüstung 2, 2 \n",
|
||
"128636 586208 7 00007, Ausrüstung 2, 2 \n",
|
||
"128915 261786 7 00007, Ausrüstung 2, 2 \n",
|
||
"\n",
|
||
" ObjektArtText VorgangsTypID VorgangsTypName \\\n",
|
||
"166 Beschichtungsmaschinen 3 Reparaturauftrag (Portal) \n",
|
||
"222 Beschichtungsmaschinen 3 Reparaturauftrag (Portal) \n",
|
||
"240 Beschichtungsmaschinen 3 Reparaturauftrag (Portal) \n",
|
||
"437 Beschichtungsmaschinen 3 Reparaturauftrag (Portal) \n",
|
||
"439 Beschichtungsmaschinen 1 Wartung \n",
|
||
"... ... ... ... \n",
|
||
"128396 Beschichtungsmaschinen 3 Reparaturauftrag (Portal) \n",
|
||
"128446 Beschichtungsmaschinen 1 Wartung \n",
|
||
"128563 Beschichtungsmaschinen 3 Reparaturauftrag (Portal) \n",
|
||
"128636 Beschichtungsmaschinen 3 Reparaturauftrag (Portal) \n",
|
||
"128915 Beschichtungsmaschinen 1 Wartung \n",
|
||
"\n",
|
||
" VorgangsDatum VorgangsStatusId VorgangsPrioritaet \\\n",
|
||
"166 2021-03-17 5 1 \n",
|
||
"222 2021-07-27 5 1 \n",
|
||
"240 2021-09-29 5 1 \n",
|
||
"437 2021-05-06 5 1 \n",
|
||
"439 2021-07-01 5 1 \n",
|
||
"... ... ... ... \n",
|
||
"128396 2023-05-08 5 1 \n",
|
||
"128446 2023-05-30 5 1 \n",
|
||
"128563 2023-05-30 5 1 \n",
|
||
"128636 2023-06-12 5 1 \n",
|
||
"128915 2023-05-30 5 1 \n",
|
||
"\n",
|
||
" VorgangsBeschreibung VorgangsOrt \\\n",
|
||
"166 -Bereich Aufwicklung, Bogenwalze Madenschraube... NaN \n",
|
||
"222 2 Keilriemen gerissen- Bereich Abluftventilato... NaN \n",
|
||
"240 Wir benötigen einen weiteren Ersatzteilschrank... NaN \n",
|
||
"437 bitte dringend 10l Eimer zum Silikon versenden... NaN \n",
|
||
"439 Monatliche Kontrolle des Flusen-Absaugrohrs NaN \n",
|
||
"... ... ... \n",
|
||
"128396 KKT Chiller Auslauf Störung. Füllstand Min. STOP NaN \n",
|
||
"128446 Monatliche Überprüfung der Gasleitung mit dem ... NaN \n",
|
||
"128563 Mischer für Beschichtungsanlage bitte ausbrenn... NaN \n",
|
||
"128636 Haken im Kran Auslauf defekt NaN \n",
|
||
"128915 Kontrolle der Risiko-Ersatzteile NaN \n",
|
||
"\n",
|
||
" VorgangsArtText ErledigungsDatum \\\n",
|
||
"166 Walze (mechanischer Defekt) 2021-03-17 \n",
|
||
"222 Antriebsriemen (Keilriemen / Zahnriemen / Flac... 2021-07-27 \n",
|
||
"240 Allgemeine Reparaturarbeiten 2021-10-04 \n",
|
||
"437 Maschineninfrastruktur 2021-05-06 \n",
|
||
"439 Maschinen-Wartung monatlich 2021-06-28 \n",
|
||
"... ... ... \n",
|
||
"128396 Allgemeine Reparaturarbeiten 2023-05-08 \n",
|
||
"128446 01 Interne Reinigung / Pflege / Überprüfung 2023-06-05 \n",
|
||
"128563 Allgemeine Reparaturarbeiten 2023-05-30 \n",
|
||
"128636 Allgemeine Reparaturarbeiten 2023-06-12 \n",
|
||
"128915 Überprüfung Risikoersatzteile 2023-05-30 \n",
|
||
"\n",
|
||
" ErledigungsArtText \\\n",
|
||
"166 Intern UTT - Reparatur \n",
|
||
"222 Intern UTT - Reparatur \n",
|
||
"240 Intern UTT - Montage \n",
|
||
"437 Intern UTT - Wartung \n",
|
||
"439 Intern UTT - Sichtkontrolle \n",
|
||
"... ... \n",
|
||
"128396 Intern UTT - Reparatur \n",
|
||
"128446 Intern UTT - Prüfung \n",
|
||
"128563 Intern UTT - Reparatur \n",
|
||
"128636 Intern UTT - Reparatur \n",
|
||
"128915 Intern UTT - Dokumentenkontrolle \n",
|
||
"\n",
|
||
" ErledigungsBeschreibung \\\n",
|
||
"166 Madenschrauben angezogen \n",
|
||
"222 die Keilriemen SPA 1282 aus Neu Ulm geholt und... \n",
|
||
"240 Wurde montiert \n",
|
||
"437 NaN \n",
|
||
"439 NaN \n",
|
||
"... ... \n",
|
||
"128396 Kühlflüssigkeit aufgefüllt und Filter gewechse... \n",
|
||
"128446 Dichtheitsprüfung der Gasleitungen \n",
|
||
"128563 erledigt \n",
|
||
"128636 Haken getauscht \n",
|
||
"128915 erledigt.\\n \n",
|
||
"\n",
|
||
" MPMelderArbeitsplatz MPAbteilungBezeichnung Arbeitsbeginn \\\n",
|
||
"166 Ausrüstung Ausrüstung 2021-03-17 \n",
|
||
"222 Ausrüstung Ausrüstung 2021-07-27 \n",
|
||
"240 Ausrüstung Ausrüstung 2021-10-04 \n",
|
||
"437 Ausrüstung Ausrüstung 2021-05-06 \n",
|
||
"439 NaN NaN NaT \n",
|
||
"... ... ... ... \n",
|
||
"128396 Ausrüstung 2 Ausrüstung 2023-05-08 \n",
|
||
"128446 NaN NaN 2023-06-05 \n",
|
||
"128563 Ausrüstung 2, Kombianlage Ausrüstung 2023-05-30 \n",
|
||
"128636 Ausrüstung Ausrüstung 2023-06-12 \n",
|
||
"128915 NaN NaN 2023-05-30 \n",
|
||
"\n",
|
||
" ErstellungsDatum \n",
|
||
"166 2021-03-17 \n",
|
||
"222 2021-07-27 \n",
|
||
"240 2021-09-29 \n",
|
||
"437 2021-05-06 \n",
|
||
"439 2021-03-03 \n",
|
||
"... ... \n",
|
||
"128396 2023-05-08 \n",
|
||
"128446 2023-04-24 \n",
|
||
"128563 2023-05-30 \n",
|
||
"128636 2023-06-12 \n",
|
||
"128915 2022-06-30 \n",
|
||
"\n",
|
||
"[272 rows x 20 columns]"
|
||
]
|
||
},
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"base.loc[base['ObjektID'] == objs[0],:]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 27,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>descr</th>\n",
|
||
" <th>len</th>\n",
|
||
" <th>num_occur</th>\n",
|
||
" <th>assoc_obj_ids</th>\n",
|
||
" <th>num_assoc_obj_ids</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>161</th>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>66</td>\n",
|
||
" <td>92592</td>\n",
|
||
" <td>[0, 17, 41, 42, 43, 44, 45, 46, 47, 51, 52, 53...</td>\n",
|
||
" <td>206</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>33</th>\n",
|
||
" <td>Wöchentliche Sichtkontrolle Reinigung</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>1654</td>\n",
|
||
" <td>[301, 304, 305, 313, 314, 331, 332, 510, 511, ...</td>\n",
|
||
" <td>18</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>130</th>\n",
|
||
" <td>Tägliche Überprüfung der Ölabscheider</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>1616</td>\n",
|
||
" <td>[0, 970, 2134, 2137]</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>159</th>\n",
|
||
" <td>Wöchentliche Kontrolle der WC-Anlagen</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>1265</td>\n",
|
||
" <td>[1352, 1353, 1354, 1684, 1685, 1686, 1687, 168...</td>\n",
|
||
" <td>11</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>139</th>\n",
|
||
" <td>Halbjährliche Kontrolle des Stabbreithalters</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>687</td>\n",
|
||
" <td>[51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 6...</td>\n",
|
||
" <td>166</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2665</th>\n",
|
||
" <td>Überprüfung der Y-Achse Schneidbrücke am LC 2 ...</td>\n",
|
||
" <td>176</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[20]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2664</th>\n",
|
||
" <td>Luftschlauch muss ausgetauscht werden. Ist und...</td>\n",
|
||
" <td>195</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[1]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2663</th>\n",
|
||
" <td>Riemenscheibe tauschen auf 650 UPM</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[74]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2660</th>\n",
|
||
" <td>Durchführung: Sollwert: 20 0,1g</td>\n",
|
||
" <td>31</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[1746]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6752</th>\n",
|
||
" <td>Befestigung Deckel für Batteriefach defekt Hal...</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[326]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6753 rows × 5 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" descr len num_occur \\\n",
|
||
"161 Tägliche Wartungstätigkeiten nach Vorgabe des ... 66 92592 \n",
|
||
"33 Wöchentliche Sichtkontrolle Reinigung 37 1654 \n",
|
||
"130 Tägliche Überprüfung der Ölabscheider 37 1616 \n",
|
||
"159 Wöchentliche Kontrolle der WC-Anlagen 37 1265 \n",
|
||
"139 Halbjährliche Kontrolle des Stabbreithalters 44 687 \n",
|
||
"... ... ... ... \n",
|
||
"2665 Überprüfung der Y-Achse Schneidbrücke am LC 2 ... 176 1 \n",
|
||
"2664 Luftschlauch muss ausgetauscht werden. Ist und... 195 1 \n",
|
||
"2663 Riemenscheibe tauschen auf 650 UPM 34 1 \n",
|
||
"2660 Durchführung: Sollwert: 20 0,1g 31 1 \n",
|
||
"6752 Befestigung Deckel für Batteriefach defekt Hal... 99 1 \n",
|
||
"\n",
|
||
" assoc_obj_ids num_assoc_obj_ids \n",
|
||
"161 [0, 17, 41, 42, 43, 44, 45, 46, 47, 51, 52, 53... 206 \n",
|
||
"33 [301, 304, 305, 313, 314, 331, 332, 510, 511, ... 18 \n",
|
||
"130 [0, 970, 2134, 2137] 4 \n",
|
||
"159 [1352, 1353, 1354, 1684, 1685, 1686, 1687, 168... 11 \n",
|
||
"139 [51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 6... 166 \n",
|
||
"... ... ... \n",
|
||
"2665 [20] 1 \n",
|
||
"2664 [1] 1 \n",
|
||
"2663 [74] 1 \n",
|
||
"2660 [1746] 1 \n",
|
||
"6752 [326] 1 \n",
|
||
"\n",
|
||
"[6753 rows x 5 columns]"
|
||
]
|
||
},
|
||
"execution_count": 27,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 28,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'Tägliche Wartungstätigkeiten nach Vorgabe des Maschinenherstellers'"
|
||
]
|
||
},
|
||
"execution_count": 28,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp1.iat[0,0]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 29,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'Wöchentliche Sichtkontrolle Reinigung'"
|
||
]
|
||
},
|
||
"execution_count": 29,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp1.iat[1,0]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### spaCy"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 30,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'Durchführung: Sollwert: 20 0,1g'"
|
||
]
|
||
},
|
||
"execution_count": 30,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"string = temp1.iloc[-2,0]\n",
|
||
"#string = temp1.iloc[0,0]\n",
|
||
"string"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 31,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"string = 'Ich spiele jeden Tag mit den Kindern im Garten. Das ist schön.'\n",
|
||
"string = 'Die Maschine XYZ ist aufgrund einer Störung im Druckluftsystem defekt.'\n",
|
||
"#string = 'Wir benötigen das Werkzeug von Herr Stöppel, um das derzeit abzuarbeiten.Dies wird durch Herrn Strebe getan.'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 32,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"doc = nlp(string)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 33,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"(11, 11)\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[ 0, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3],\n",
|
||
" [ 0, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3],\n",
|
||
" [ 0, 0, 2, 3, 3, 3, 3, 3, 3, 3, 3],\n",
|
||
" [ 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3],\n",
|
||
" [ 0, 0, 0, 0, 4, 4, 4, 4, 4, 3, 3],\n",
|
||
" [ 0, 0, 0, 0, 0, 5, 6, 6, 6, 3, 3],\n",
|
||
" [ 0, 0, 0, 0, 0, 0, 6, 6, 6, 3, 3],\n",
|
||
" [ 0, 0, 0, 0, 0, 0, 0, 7, 7, 3, 3],\n",
|
||
" [ 0, 0, 0, 0, 0, 0, 0, 0, 8, 3, 3],\n",
|
||
" [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 3],\n",
|
||
" [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10]])"
|
||
]
|
||
},
|
||
"execution_count": 33,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"lca_matrix = doc.get_lca_matrix()\n",
|
||
"print(lca_matrix.shape)\n",
|
||
"lca_matrix = np.triu(lca_matrix)\n",
|
||
"lca_matrix"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"nested children:\n",
|
||
"- [x] Gewichtung über Anzahl Erscheinungen\n",
|
||
"- [x] AUX-Wörter: evtl. alle aossoziierten Wörter in Beziehung setzen\n",
|
||
"- [ ] Dual Link zwischen zwei Wörtern eines Baums (sinnvoll?)\n",
|
||
" - nicht wirklich sinnvoll, da einfache Verbindung durch Gewicht schon berücksichtigt\n",
|
||
" - schlussendlich würde jede Verbindung im Gewicht verdoppelt werden"
|
||
]
|
||
},
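The weighting rules listed above can be illustrated with a small pure-Python sketch. It is not the notebook's implementation: the function name `edge_weights` and the toy `connections` input are hypothetical, and the only assumption carried over is the record layout `(parent_lemma, parent_pos) -> list of [(descendant_lemma, weight), ...]`.

```python
from collections import Counter
from itertools import combinations


def edge_weights(connections):
    """Accumulate directed edge weights from parent -> descendant records.

    `connections` maps (parent_lemma, parent_pos) to a list of sentences,
    each sentence being a list of (descendant_lemma, weight) tuples.
    """
    edges = Counter()
    for (parent, pos), sentences in connections.items():
        for descendants in sentences:
            if pos != "AUX":
                # regular parent: one weighted edge per descendant
                for lemma, weight in descendants:
                    edges[(parent, lemma)] += weight
            else:
                # auxiliary parent: skip it and link its descendants pairwise
                for (w1, weight), (w2, _) in combinations(descendants, 2):
                    edges[(w1, w2)] += weight
    return edges


# hypothetical toy input: one description that occurs 10 times
connections = {
    ("sein", "AUX"): [[("Maschine", 10), ("Störung", 10), ("defekt", 10)]],
    ("Störung", "NOUN"): [[("Druckluftsystem", 10)]],
}
edges = edge_weights(connections)
```

Each pair is linked once only. Adding the reverse direction as well would double every weight without adding information, which is the argument against the dual link.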
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 34,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# simulate occurence counter\n",
|
||
"OCC_COUNTER = 10"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 35,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"SPELL_CHECK_NON_CHARS = set([' ', '.', ',', ';', ':', '-'])\n",
|
||
"\n",
|
||
"def pre_clean_word(string: str) -> str:\n",
|
||
" \n",
|
||
" pattern = r'[^A-Za-zäöüÄÖÜ]+'\n",
|
||
" string = re.sub(pattern, '', string)\n",
|
||
" \"\"\"\n",
|
||
" for char in SPELL_CHECK_NON_CHARS:\n",
|
||
" string = string.replace(char, '')\n",
|
||
" \"\"\"\n",
|
||
" \n",
|
||
" return string\n",
|
||
"\n",
|
||
"# https://stackoverflow.com/questions/25341945/check-if-string-has-date-any-format \n",
|
||
"def is_str_date(string, fuzzy=False):\n",
|
||
" \n",
|
||
" try:\n",
|
||
" parse(string, fuzzy=fuzzy)\n",
|
||
" return True\n",
|
||
" except ValueError:\n",
|
||
" return False\n",
|
||
"\n",
|
||
"\n",
|
||
"def obtain_sub_tree(token):\n",
|
||
" # check if token is a POS of interest\n",
|
||
" descendants = list(token.subtree)\n",
|
||
" descendants.remove(token)\n",
|
||
" logger.debug(f'Token >>{token}<< has subtree >>{descendants}<<')\n",
|
||
" return descendants\n",
|
||
"\n",
|
||
"\n",
|
||
"def add_children_descendants(\n",
|
||
" parent,\n",
|
||
" weight,\n",
|
||
" connections,\n",
|
||
" unique_tokens,\n",
|
||
" children_sents,\n",
|
||
"):\n",
|
||
" # add child as key\n",
|
||
" if (parent.lemma_, parent.pos_) in connections:\n",
|
||
" connections[(parent.lemma_, parent.pos_)].append(children_sents)\n",
|
||
" #connections[parent.lemma_].append([descendant.lemma_, descendant])\n",
|
||
" else:\n",
|
||
" # do not add auxiliary words\n",
|
||
" if parent.pos_ != 'AUX':\n",
|
||
" unique_tokens.add(parent.lemma_)\n",
|
||
" connections[(parent.lemma_, parent.pos_)] = list()\n",
|
||
" connections[(parent.lemma_, parent.pos_)].append(children_sents)\n",
|
||
" #connections[parent.lemma_].append([descendant.lemma_, descendant])\n",
|
||
" \n",
|
||
" return None\n",
|
||
"\n",
|
||
"\n",
|
||
"def obtain_descendant_info(\n",
|
||
" doc,\n",
|
||
" weight,\n",
|
||
" POS_of_interest,\n",
|
||
" TAG_of_interest,\n",
|
||
" connections,\n",
|
||
" unique_tokens,\n",
|
||
" spell_check_candidates,\n",
|
||
" spell_check_whitelist,\n",
|
||
" spell_checker,\n",
|
||
" corrections,\n",
|
||
"):\n",
|
||
" global GENERAL_BLACKLIST\n",
|
||
" \n",
|
||
" # iterate over sentences\n",
|
||
" for sent in doc.sents:\n",
|
||
" # spell check list\n",
|
||
" spell_check_words = list()\n",
|
||
" \n",
|
||
" # iterate over tokens in one sentence\n",
|
||
" for token in sent:\n",
|
||
" \n",
|
||
" if not (token.pos_ in POS_of_interest or token.tag_ in TAG_of_interest):\n",
|
||
" continue\n",
|
||
" elif token.lemma_.lower() in GENERAL_BLACKLIST:\n",
|
||
" logger.debug(f'Eliminated parent >>{token}<< because of blacklist')\n",
|
||
" continue\n",
|
||
" \n",
|
||
" # spell check\n",
|
||
" if token.lemma_.lower() not in spell_check_whitelist:\n",
|
||
" word = pre_clean_word(string=token.lemma_.lower())\n",
|
||
" if word in corrections:\n",
|
||
" word = corrections[word]\n",
|
||
" elif not word.isdigit():\n",
|
||
" spell_check_words.append(word)\n",
|
||
" \n",
|
||
" descendants = obtain_sub_tree(token=token)\n",
|
||
" \n",
|
||
" # iterate over all children if there are any\n",
|
||
" if descendants is not None:\n",
|
||
" # list with all children in the current sentence\n",
|
||
" children_sents = list()\n",
|
||
" \n",
|
||
" for child in descendants:\n",
|
||
" logger.debug(f'Token is >>{token}<< with child >>{child}<< and POS {child.pos_}')\n",
|
||
" \n",
|
||
" # elimnate cases of cross-references with verbs\n",
|
||
" if ((token.pos_ == 'AUX' or token.pos_ == 'VERB') and\n",
|
||
" (child.pos_ == 'AUX' or child.pos_ == 'VERB')):\n",
|
||
" continue\n",
|
||
" elif not (child.pos_ in POS_of_interest or child.tag_ in TAG_of_interest):\n",
|
||
" continue\n",
|
||
" elif child.lemma_.lower() in GENERAL_BLACKLIST:\n",
|
||
" logger.debug(f'Eliminated child >>{child}<< because of blacklist')\n",
|
||
" continue\n",
|
||
" \n",
|
||
" if (child not in DESC_BLACKLIST and\n",
|
||
" not is_str_date(string=child.text)):\n",
|
||
" children_sents.append((child.lemma_, weight))\n",
|
||
" \n",
|
||
" if child.lemma_ not in unique_tokens:\n",
|
||
" unique_tokens.add(child.lemma_)\n",
|
||
" \n",
|
||
" if child.lemma_.lower() not in spell_check_whitelist:\n",
|
||
" word = pre_clean_word(string=child.lemma_.lower())\n",
|
||
" if word in corrections:\n",
|
||
" word = corrections[word]\n",
|
||
" elif not word.isdigit():\n",
|
||
" spell_check_words.append(word)\n",
|
||
" \n",
|
||
" # add list of children for current parent if not empty\n",
|
||
" if children_sents:\n",
|
||
" add_children_descendants(\n",
|
||
" parent=token,\n",
|
||
" weight=weight,\n",
|
||
" connections=connections,\n",
|
||
" unique_tokens=unique_tokens,\n",
|
||
" children_sents=children_sents,\n",
|
||
" )\n",
|
||
"\n",
|
||
" misspelled_candidates = spell_checker.unknown(spell_check_words)\n",
|
||
" spell_check_candidates.update(misspelled_candidates)\n",
|
||
" \n",
|
||
" \n",
|
||
" return None"
|
||
]
|
||
},
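As a quick check of what the word pre-cleaning step does before the spell check, here is a minimal sketch using only `re`. The names `NON_LETTERS` and `clean_word` are illustrative, and including `ß` in the character class is an assumption made here so that German words such as "größe" survive intact.

```python
import re

# keep ASCII letters plus German umlauts and ß; drop everything else
# (ß in the class is an assumption of this sketch)
NON_LETTERS = re.compile(r"[^A-Za-zäöüÄÖÜß]+")


def clean_word(word: str) -> str:
    """Remove digits, punctuation and whitespace from a single word."""
    return NON_LETTERS.sub("", word)


samples = ["risiko-ersatzteile", "0,1g", "2023-05-30", "größe"]
cleaned = [clean_word(s) for s in samples]
```

Strings made up purely of digits and punctuation, such as a date, collapse to an empty string. A caller may want to skip empty results before passing the list to a spell checker.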
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 36,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Token</th>\n",
|
||
" <th>Lemma</th>\n",
|
||
" <th>POS</th>\n",
|
||
" <th>Tag</th>\n",
|
||
" <th>Dep</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Die</td>\n",
|
||
" <td>der</td>\n",
|
||
" <td>DET</td>\n",
|
||
" <td>ART</td>\n",
|
||
" <td>nk</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Maschine</td>\n",
|
||
" <td>Maschine</td>\n",
|
||
" <td>NOUN</td>\n",
|
||
" <td>NN</td>\n",
|
||
" <td>sb</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>XYZ</td>\n",
|
||
" <td>XYZ</td>\n",
|
||
" <td>PROPN</td>\n",
|
||
" <td>NE</td>\n",
|
||
" <td>nk</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>ist</td>\n",
|
||
" <td>sein</td>\n",
|
||
" <td>AUX</td>\n",
|
||
" <td>VAFIN</td>\n",
|
||
" <td>ROOT</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>aufgrund</td>\n",
|
||
" <td>aufgrund</td>\n",
|
||
" <td>ADP</td>\n",
|
||
" <td>APPR</td>\n",
|
||
" <td>mo</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>einer</td>\n",
|
||
" <td>ein</td>\n",
|
||
" <td>DET</td>\n",
|
||
" <td>ART</td>\n",
|
||
" <td>nk</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>Störung</td>\n",
|
||
" <td>Störung</td>\n",
|
||
" <td>NOUN</td>\n",
|
||
" <td>NN</td>\n",
|
||
" <td>nk</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>im</td>\n",
|
||
" <td>in</td>\n",
|
||
" <td>ADP</td>\n",
|
||
" <td>APPRART</td>\n",
|
||
" <td>mnr</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>Druckluftsystem</td>\n",
|
||
" <td>Druckluftsystem</td>\n",
|
||
" <td>NOUN</td>\n",
|
||
" <td>NN</td>\n",
|
||
" <td>nk</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>defekt</td>\n",
|
||
" <td>defekt</td>\n",
|
||
" <td>ADV</td>\n",
|
||
" <td>ADJD</td>\n",
|
||
" <td>pd</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>.</td>\n",
|
||
" <td>--</td>\n",
|
||
" <td>PUNCT</td>\n",
|
||
" <td>$.</td>\n",
|
||
" <td>punct</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Token Lemma POS Tag Dep\n",
|
||
"0 Die der DET ART nk\n",
|
||
"1 Maschine Maschine NOUN NN sb\n",
|
||
"2 XYZ XYZ PROPN NE nk\n",
|
||
"3 ist sein AUX VAFIN ROOT\n",
|
||
"4 aufgrund aufgrund ADP APPR mo\n",
|
||
"5 einer ein DET ART nk\n",
|
||
"6 Störung Störung NOUN NN nk\n",
|
||
"7 im in ADP APPRART mnr\n",
|
||
"8 Druckluftsystem Druckluftsystem NOUN NN nk\n",
|
||
"9 defekt defekt ADV ADJD pd\n",
|
||
"10 . -- PUNCT $. punct"
|
||
]
|
||
},
|
||
"execution_count": 36,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.DataFrame({\"Token\": [token.text for token in doc],\n",
|
||
" \"Lemma\": [token.lemma_ for token in doc],\n",
|
||
" \"POS\": [token.pos_ for token in doc],\n",
|
||
" \"Tag\": [token.tag_ for token in doc],\n",
|
||
" \"Dep\": [token.dep_ for token in doc]})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 37,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"def obtain_adj_matrix(unique_tokens, connections):\n",
|
||
"\n",
|
||
" adj_mat = pd.DataFrame(\n",
|
||
" data=0, \n",
|
||
" columns=list(unique_tokens), \n",
|
||
" index=list(unique_tokens),\n",
|
||
" dtype=np.uint32,\n",
|
||
" )\n",
|
||
" \n",
|
||
" for (pred, POS), descendants_list in connections.items():\n",
|
||
" #print(f'{pred=}, {descendants=}')\n",
|
||
" \n",
|
||
" for descendants in descendants_list:\n",
|
||
" #print(f'{descendants}')\n",
|
||
" \n",
|
||
" if POS != 'AUX':\n",
|
||
" for (desc, weight) in descendants:\n",
|
||
" adj_mat.at[pred, desc] += weight\n",
|
||
" \n",
|
||
" else:\n",
|
||
" if len(descendants) > 1:\n",
|
||
" # if auxiliary word, make connection between all associated words\n",
|
||
" combs = combinations(descendants, r=2)\n",
|
||
" \n",
|
||
" for comb in combs:\n",
|
||
" # comb is tuple ((word_1, weight), (word_2, weight))\n",
|
||
" weight = comb[0][1]\n",
|
||
" word_1 = comb[0][0]\n",
|
||
" word_2 = comb[1][0]\n",
|
||
" \n",
|
||
" \"\"\"\n",
|
||
" if ((word_1 == 'Eigenverantwortlichkeit' or word_1 == 'neu') and\n",
|
||
" (word_2 == 'Eigenverantwortlichkeit' or word_2 == 'neu')):\n",
|
||
" print(f'Hello from {pred=} with {descendants=}')\n",
|
||
" \"\"\"\n",
|
||
" \n",
|
||
" adj_mat.at[word_1, word_2] += weight\n",
|
||
" \n",
|
||
" \n",
|
||
" return adj_mat\n",
|
||
"\n",
|
||
"\n",
|
||
"def make_undir_adj_matrix(adj_mat):\n",
|
||
" \n",
|
||
" adj_mat_undir = adj_mat.copy()\n",
|
||
" arr = adj_mat_undir.to_numpy()\n",
|
||
" arr_upper = np.triu(arr)\n",
|
||
" arr_lower = np.tril(arr)\n",
|
||
" arr_lower = np.rot90(np.fliplr(arr_lower))\n",
|
||
" arr_new = arr_lower + arr_upper\n",
|
||
" \n",
|
||
" adj_mat_undir.loc[:] = arr_new\n",
|
||
" \n",
|
||
" return adj_mat_undir"
|
||
]
|
||
},
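The folding step that turns the directed adjacency matrix into an undirected one can be checked on a toy example. The sketch below uses plain NumPy arrays instead of the labelled DataFrame, and the 3x3 values are made up. It also confirms that flipping left-right and then rotating by 90 degrees is the same as a transpose.

```python
import numpy as np

# hypothetical directed weights: entry [i, j] is the weight of edge i -> j
directed = np.array(
    [[0, 2, 0],
     [1, 0, 3],
     [4, 0, 0]],
    dtype=np.uint32,
)

upper = np.triu(directed)  # edges i -> j with i <= j
lower = np.tril(directed)  # edges i -> j with i >= j

# flip + rotate is just a transpose of the lower triangle
same_as_transpose = np.array_equal(np.rot90(np.fliplr(lower)), lower.T)

# fold the lower triangle onto the upper one: weights of i -> j and j -> i add up
undirected = upper + lower.T
```

Both `np.triu` and `np.tril` keep the main diagonal by default, so a non-zero self-loop weight would be counted twice in this sum. Passing `k=-1` to `np.tril` avoids that. The toy matrix has a zero diagonal, so the result is unaffected.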
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 38,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<span class=\"tex2jax_ignore\"><svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" xml:lang=\"de\" id=\"ef1bf216422b43369e5c4ba89cb854dd-0\" class=\"displacy\" width=\"1800\" height=\"399.5\" direction=\"ltr\" style=\"max-width: none; height: 399.5px; color: #000000; background: #ffffff; font-family: Arial; direction: ltr\">\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"50\">Die</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"50\">DET</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"225\">Maschine</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"225\">NOUN</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"400\">XYZ</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"400\">PROPN</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"575\">ist</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"575\">AUX</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"750\">aufgrund</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"750\">ADP</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"925\">einer</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"925\">DET</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1100\">Störung</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1100\">NOUN</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1275\">im</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1275\">ADP</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1450\">Druckluftsystem</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1450\">NOUN</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"309.5\">\n",
|
||
" <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1625\">defekt.</tspan>\n",
|
||
" <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1625\">ADV</tspan>\n",
|
||
"</text>\n",
|
||
"\n",
|
||
"<g class=\"displacy-arrow\">\n",
|
||
" <path class=\"displacy-arc\" id=\"arrow-ef1bf216422b43369e5c4ba89cb854dd-0-0\" stroke-width=\"2px\" d=\"M70,264.5 C70,177.0 215.0,177.0 215.0,264.5\" fill=\"none\" stroke=\"currentColor\"/>\n",
|
||
" <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
|
||
" <textPath xlink:href=\"#arrow-ef1bf216422b43369e5c4ba89cb854dd-0-0\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nk</textPath>\n",
|
||
" </text>\n",
|
||
" <path class=\"displacy-arrowhead\" d=\"M70,266.5 L62,254.5 78,254.5\" fill=\"currentColor\"/>\n",
|
||
"</g>\n",
|
||
"\n",
|
||
"<g class=\"displacy-arrow\">\n",
|
||
" <path class=\"displacy-arc\" id=\"arrow-ef1bf216422b43369e5c4ba89cb854dd-0-1\" stroke-width=\"2px\" d=\"M245,264.5 C245,89.5 570.0,89.5 570.0,264.5\" fill=\"none\" stroke=\"currentColor\"/>\n",
|
||
" <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
|
||
" <textPath xlink:href=\"#arrow-ef1bf216422b43369e5c4ba89cb854dd-0-1\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">sb</textPath>\n",
|
||
" </text>\n",
|
||
" <path class=\"displacy-arrowhead\" d=\"M245,266.5 L237,254.5 253,254.5\" fill=\"currentColor\"/>\n",
|
||
"</g>\n",
|
||
"\n",
|
||
"<g class=\"displacy-arrow\">\n",
|
||
" <path class=\"displacy-arc\" id=\"arrow-ef1bf216422b43369e5c4ba89cb854dd-0-2\" stroke-width=\"2px\" d=\"M245,264.5 C245,177.0 390.0,177.0 390.0,264.5\" fill=\"none\" stroke=\"currentColor\"/>\n",
|
||
" <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
|
||
" <textPath xlink:href=\"#arrow-ef1bf216422b43369e5c4ba89cb854dd-0-2\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nk</textPath>\n",
|
||
" </text>\n",
|
||
" <path class=\"displacy-arrowhead\" d=\"M390.0,266.5 L398.0,254.5 382.0,254.5\" fill=\"currentColor\"/>\n",
|
||
"</g>\n",
|
||
"\n",
|
||
"<g class=\"displacy-arrow\">\n",
|
||
" <path class=\"displacy-arc\" id=\"arrow-ef1bf216422b43369e5c4ba89cb854dd-0-3\" stroke-width=\"2px\" d=\"M595,264.5 C595,177.0 740.0,177.0 740.0,264.5\" fill=\"none\" stroke=\"currentColor\"/>\n",
|
||
" <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
|
||
" <textPath xlink:href=\"#arrow-ef1bf216422b43369e5c4ba89cb854dd-0-3\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">mo</textPath>\n",
|
||
" </text>\n",
|
||
" <path class=\"displacy-arrowhead\" d=\"M740.0,266.5 L748.0,254.5 732.0,254.5\" fill=\"currentColor\"/>\n",
|
||
"</g>\n",
|
||
"\n",
|
||
"<g class=\"displacy-arrow\">\n",
|
||
" <path class=\"displacy-arc\" id=\"arrow-ef1bf216422b43369e5c4ba89cb854dd-0-4\" stroke-width=\"2px\" d=\"M945,264.5 C945,177.0 1090.0,177.0 1090.0,264.5\" fill=\"none\" stroke=\"currentColor\"/>\n",
|
||
" <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
|
||
" <textPath xlink:href=\"#arrow-ef1bf216422b43369e5c4ba89cb854dd-0-4\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nk</textPath>\n",
|
||
" </text>\n",
|
||
" <path class=\"displacy-arrowhead\" d=\"M945,266.5 L937,254.5 953,254.5\" fill=\"currentColor\"/>\n",
|
||
"</g>\n",
|
||
"\n",
|
||
"<g class=\"displacy-arrow\">\n",
|
||
" <path class=\"displacy-arc\" id=\"arrow-ef1bf216422b43369e5c4ba89cb854dd-0-5\" stroke-width=\"2px\" d=\"M770,264.5 C770,89.5 1095.0,89.5 1095.0,264.5\" fill=\"none\" stroke=\"currentColor\"/>\n",
|
||
" <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
|
||
" <textPath xlink:href=\"#arrow-ef1bf216422b43369e5c4ba89cb854dd-0-5\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nk</textPath>\n",
|
||
" </text>\n",
|
||
" <path class=\"displacy-arrowhead\" d=\"M1095.0,266.5 L1103.0,254.5 1087.0,254.5\" fill=\"currentColor\"/>\n",
|
||
"</g>\n",
|
||
"\n",
|
||
"<g class=\"displacy-arrow\">\n",
|
||
" <path class=\"displacy-arc\" id=\"arrow-ef1bf216422b43369e5c4ba89cb854dd-0-6\" stroke-width=\"2px\" d=\"M1120,264.5 C1120,177.0 1265.0,177.0 1265.0,264.5\" fill=\"none\" stroke=\"currentColor\"/>\n",
|
||
" <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
|
||
" <textPath xlink:href=\"#arrow-ef1bf216422b43369e5c4ba89cb854dd-0-6\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">mnr</textPath>\n",
|
||
" </text>\n",
|
||
" <path class=\"displacy-arrowhead\" d=\"M1265.0,266.5 L1273.0,254.5 1257.0,254.5\" fill=\"currentColor\"/>\n",
|
||
"</g>\n",
|
||
"\n",
|
||
"<g class=\"displacy-arrow\">\n",
|
||
" <path class=\"displacy-arc\" id=\"arrow-ef1bf216422b43369e5c4ba89cb854dd-0-7\" stroke-width=\"2px\" d=\"M1295,264.5 C1295,177.0 1440.0,177.0 1440.0,264.5\" fill=\"none\" stroke=\"currentColor\"/>\n",
|
||
" <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
|
||
" <textPath xlink:href=\"#arrow-ef1bf216422b43369e5c4ba89cb854dd-0-7\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nk</textPath>\n",
|
||
" </text>\n",
|
||
" <path class=\"displacy-arrowhead\" d=\"M1440.0,266.5 L1448.0,254.5 1432.0,254.5\" fill=\"currentColor\"/>\n",
|
||
"</g>\n",
|
||
"\n",
|
||
"<g class=\"displacy-arrow\">\n",
|
||
" <path class=\"displacy-arc\" id=\"arrow-ef1bf216422b43369e5c4ba89cb854dd-0-8\" stroke-width=\"2px\" d=\"M595,264.5 C595,2.0 1625.0,2.0 1625.0,264.5\" fill=\"none\" stroke=\"currentColor\"/>\n",
|
||
" <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
|
||
" <textPath xlink:href=\"#arrow-ef1bf216422b43369e5c4ba89cb854dd-0-8\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">pd</textPath>\n",
|
||
" </text>\n",
|
||
" <path class=\"displacy-arrowhead\" d=\"M1625.0,266.5 L1633.0,254.5 1617.0,254.5\" fill=\"currentColor\"/>\n",
|
||
"</g>\n",
|
||
"</svg></span>"
|
||
],
|
||
"text/plain": [
|
||
"<IPython.core.display.HTML object>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"spacy.displacy.render(doc)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
"#### Entire dataset"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 39,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
"# use the full dataset (all unique descriptions with their occurrence counts)\n",
|
||
"descr = temp1[['descr', 'num_occur']]\n",
|
||
"#descr = descr.iloc[50:200,:]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 40,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"#descr.iat[0,0] = 'Das ist ein Test am 24.08.2023'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 41,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"6753"
|
||
]
|
||
},
|
||
"execution_count": 41,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"len(descr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 42,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>descr</th>\n",
|
||
" <th>num_occur</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>161</th>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>92592</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>33</th>\n",
|
||
" <td>Wöchentliche Sichtkontrolle Reinigung</td>\n",
|
||
" <td>1654</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>130</th>\n",
|
||
" <td>Tägliche Überprüfung der Ölabscheider</td>\n",
|
||
" <td>1616</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>159</th>\n",
|
||
" <td>Wöchentliche Kontrolle der WC-Anlagen</td>\n",
|
||
" <td>1265</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>139</th>\n",
|
||
" <td>Halbjährliche Kontrolle des Stabbreithalters</td>\n",
|
||
" <td>687</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2665</th>\n",
|
||
" <td>Überprüfung der Y-Achse Schneidbrücke am LC 2 ...</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2664</th>\n",
|
||
" <td>Luftschlauch muss ausgetauscht werden. Ist und...</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2663</th>\n",
|
||
" <td>Riemenscheibe tauschen auf 650 UPM</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2660</th>\n",
|
||
" <td>Durchführung: Sollwert: 20 0,1g</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6752</th>\n",
|
||
" <td>Befestigung Deckel für Batteriefach defekt Hal...</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6753 rows × 2 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" descr num_occur\n",
|
||
"161 Tägliche Wartungstätigkeiten nach Vorgabe des ... 92592\n",
|
||
"33 Wöchentliche Sichtkontrolle Reinigung 1654\n",
|
||
"130 Tägliche Überprüfung der Ölabscheider 1616\n",
|
||
"159 Wöchentliche Kontrolle der WC-Anlagen 1265\n",
|
||
"139 Halbjährliche Kontrolle des Stabbreithalters 687\n",
|
||
"... ... ...\n",
|
||
"2665 Überprüfung der Y-Achse Schneidbrücke am LC 2 ... 1\n",
|
||
"2664 Luftschlauch muss ausgetauscht werden. Ist und... 1\n",
|
||
"2663 Riemenscheibe tauschen auf 650 UPM 1\n",
|
||
"2660 Durchführung: Sollwert: 20 0,1g 1\n",
|
||
"6752 Befestigung Deckel für Batteriefach defekt Hal... 1\n",
|
||
"\n",
|
||
"[6753 rows x 2 columns]"
|
||
]
|
||
},
|
||
"execution_count": 42,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"descr"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 49,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
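"# LOAD_CALC_FILES: load cached results from disk instead of recomputing them.\n",
"# It must be set (here or in an earlier cell) before the cells below run.\n",
"# IS_TEST: always run the token extraction and skip loading/saving the adjacency matrix.\n",
"#LOAD_CALC_FILES = True\n",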
|
||
"#LOAD_CALC_FILES = False\n",
|
||
"#IS_TEST = True\n",
|
||
"IS_TEST = False"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 44,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"spell_check_whitelist = {\n",
|
||
" '',\n",
|
||
" 'beschlag',\n",
|
||
" 'brandschutztechnische',\n",
|
||
" 'dichtung',\n",
|
||
" 'festhaltevorrichtung',\n",
|
||
" 'funktion',\n",
|
||
" 'halbjährliche',\n",
|
||
" 'kontrolle',\n",
|
||
" 'maschinenhersteller',\n",
|
||
" 'prüfung',\n",
|
||
" 'reinigung',\n",
|
||
" 'scharnier',\n",
|
||
" 'schließvorrichtung',\n",
|
||
" 'schmierung',\n",
|
||
" 'sichtkontrolle',\n",
|
||
" 'stabbreithalter',\n",
|
||
" 'technikrundgang',\n",
|
||
" 'vorgabe',\n",
|
||
" 'wartungstätigkeit',\n",
|
||
" 'wcanlage',\n",
|
||
" 'ölabscheider',\n",
|
||
" 'abarbeiten',\n",
|
||
" 'abgleichen',\n",
|
||
" 'abschmieren',\n",
|
||
" 'abschmierung',\n",
|
||
" 'abteilungsleiter',\n",
|
||
" 'akku',\n",
|
||
" 'analyse',\n",
|
||
" 'arbeitsplan',\n",
|
||
" 'aschenbecher',\n",
|
||
" 'auffüllen',\n",
|
||
" 'auflistung',\n",
|
||
" 'befestigungsschraube',\n",
|
||
" 'beschädigung',\n",
|
||
" 'betriebsstunde',\n",
|
||
" 'blombe',\n",
|
||
" 'blombieren',\n",
|
||
" 'brückner',\n",
|
||
" 'campenabwickler',\n",
|
||
" 'campenaufwickler',\n",
|
||
" 'desinfektionsmittel',\n",
|
||
" 'dichtigkeit',\n",
|
||
" 'druckkontrolle',\n",
|
||
" 'efficiosystem',\n",
|
||
" 'eigenverantwortlichkeit',\n",
|
||
" 'einrichtung',\n",
|
||
" 'email',\n",
|
||
" 'erledigungsdatum',\n",
|
||
" 'extradate',\n",
|
||
" 'extradatum',\n",
|
||
" 'filter',\n",
|
||
" 'firma',\n",
|
||
" 'formplatte',\n",
|
||
" 'frostprävention',\n",
|
||
" 'gegendruckbolze',\n",
|
||
" 'gesamtanlage',\n",
|
||
" 'heizungsanlage',\n",
|
||
" 'keller',\n",
|
||
" 'kesselhauskontrolle',\n",
|
||
" 'kesselwasser',\n",
|
||
" 'koffer',\n",
|
||
" 'kompensator',\n",
|
||
" 'kompressorstation',\n",
|
||
" 'kondensat',\n",
|
||
" 'kühlturm',\n",
|
||
" 'kühltürme',\n",
|
||
" 'lager',\n",
|
||
" 'laserabteilung',\n",
|
||
" 'leckage',\n",
|
||
" 'leerung',\n",
|
||
" 'leiterprüfung',\n",
|
||
" 'linearkugellager',\n",
|
||
" 'luftdruckkontrolle',\n",
|
||
" 'magazin',\n",
|
||
" 'maschinenbediener',\n",
|
||
" 'messwert',\n",
|
||
" 'monat',\n",
|
||
" 'motor',\n",
|
||
" 'papiermüllbehälter',\n",
|
||
" 'personalbüro',\n",
|
||
" 'pflasterschrank',\n",
|
||
" 'rieme',\n",
|
||
" 'rollenkette',\n",
|
||
" 'rundgang',\n",
|
||
" 'schweißkopf',\n",
|
||
" 'schweisskopf',\n",
|
||
" 'sichtprüfung',\n",
|
||
" 'speisewasser',\n",
|
||
" 'sprinkleranlage',\n",
|
||
" 'temperatursensor',\n",
|
||
" 'terminieren',\n",
|
||
" 'ticket',\n",
|
||
" 'trommel',\n",
|
||
" 'täglicher',\n",
|
||
" 'uvröhre',\n",
|
||
" 'ventilator',\n",
|
||
" 'verbandsmaterial',\n",
|
||
" 'verschleiß',\n",
|
||
" 'verschleiss',\n",
|
||
" 'vorbelegung',\n",
|
||
" 'wartung',\n",
|
||
" 'wartungsarbeit',\n",
|
||
" 'wartungsplan',\n",
|
||
" 'wasseraufbereitung',\n",
|
||
" 'wasseraufbereitungsanlage',\n",
|
||
" 'wasserverbrauch',\n",
|
||
" 'weberei',\n",
|
||
" 'wumagtrockner',\n",
|
||
" 'wäscherkontrolle',\n",
|
||
" 'wöchig',\n",
|
||
" 'abdichten',\n",
|
||
" 'abfluprüfung',\n",
|
||
" 'ablesen',\n",
|
||
" 'abluftkanal',\n",
|
||
" 'absauganlage',\n",
|
||
" 'abspeichern',\n",
|
||
" 'absprache',\n",
|
||
" 'aktivkohlepatron',\n",
|
||
" 'aktivkohlepatrone',\n",
|
||
" 'anbackung',\n",
|
||
" 'anfragen',\n",
|
||
" 'angebot',\n",
|
||
" 'anpresswalze',\n",
|
||
" 'ansaug',\n",
|
||
" 'anschluss',\n",
|
||
" 'anschluß',\n",
|
||
" 'anzahl',\n",
|
||
" 'auen',\n",
|
||
" 'auenbereich',\n",
|
||
" 'aueneinheit',\n",
|
||
" 'aufwickler',\n",
|
||
" 'ausblasöffnung',\n",
|
||
" 'ausbrennen',\n",
|
||
" 'auslassventil',\n",
|
||
" 'ausrüstung',\n",
|
||
" 'austausch',\n",
|
||
" 'axialpendelrollenlager',\n",
|
||
" 'batteriewechsel',\n",
|
||
" 'batterieüberprüfung',\n",
|
||
" 'baugruppe',\n",
|
||
" 'baumwolltuch',\n",
|
||
" 'bauteil',\n",
|
||
" 'befeuchter',\n",
|
||
" 'beleuchtung',\n",
|
||
" 'beschichtunglegierung',\n",
|
||
" 'besprechungszimmer',\n",
|
||
" 'bestandskontrolle',\n",
|
||
" 'bestellformular',\n",
|
||
" 'bestätigung',\n",
|
||
" 'bezeichnung',\n",
|
||
" 'binder',\n",
|
||
" 'blutstop',\n",
|
||
" 'bolze',\n",
|
||
" 'breitstreckwalze',\n",
|
||
" 'containerstellfläche',\n",
|
||
" 'contrawalze',\n",
|
||
" 'dachfläche',\n",
|
||
" 'dampfzylinder',\n",
|
||
" 'deformierung',\n",
|
||
" 'dezember',\n",
|
||
" 'din',\n",
|
||
" 'docke',\n",
|
||
" 'dokumentation',\n",
|
||
" 'dosierpumpe',\n",
|
||
" 'druckluftbehälter',\n",
|
||
" 'druckluftleitung',\n",
|
||
" 'druckluftschläuche',\n",
|
||
" 'drucktestkontrolle',\n",
|
||
" 'einterminieren',\n",
|
||
" 'eintragung',\n",
|
||
" 'einzelprotokoll',\n",
|
||
" 'einziehwalze',\n",
|
||
" 'elektisch',\n",
|
||
" 'element',\n",
|
||
" 'enthärtung',\n",
|
||
" 'entwässern',\n",
|
||
" 'erledigungsbeschreibeung',\n",
|
||
" 'erstehilfeeinrichtung',\n",
|
||
" 'erweiterung',\n",
|
||
" 'explosionsschutzanlage',\n",
|
||
" 'extradaten',\n",
|
||
" 'exzenterringbefestigung',\n",
|
||
" 'fa',\n",
|
||
" 'fach',\n",
|
||
" 'faltenbalge',\n",
|
||
" 'feedbackinput',\n",
|
||
" 'feuerwehrumfahrung',\n",
|
||
" 'filert',\n",
|
||
" 'filteranlage',\n",
|
||
" 'filterelement',\n",
|
||
" 'filterstufe',\n",
|
||
" 'fixtermin',\n",
|
||
" 'flanschlager',\n",
|
||
" 'flanschlagerquadrat',\n",
|
||
" 'fluchtwegsymbol',\n",
|
||
" 'flusenabsaugrohr',\n",
|
||
" 'freilauf',\n",
|
||
" 'fremdkörper',\n",
|
||
" 'führungswagen',\n",
|
||
" 'gaslager',\n",
|
||
" 'gaszählerstand',\n",
|
||
" 'gatter',\n",
|
||
" 'geräteinner',\n",
|
||
" 'geräteinneres',\n",
|
||
" 'geräusch',\n",
|
||
" 'gesamt',\n",
|
||
" 'gesamterzeugt',\n",
|
||
" 'getränkeautomat',\n",
|
||
" 'gewindebefestigung',\n",
|
||
" 'gewindestiftbefestigung',\n",
|
||
" 'gleitschiene',\n",
|
||
" 'grat',\n",
|
||
" 'gro',\n",
|
||
" 'grundplatte',\n",
|
||
" 'halle',\n",
|
||
" 'haupteingang',\n",
|
||
" 'hebebühne',\n",
|
||
" 'hebezeug',\n",
|
||
" 'helm',\n",
|
||
" 'hersteller',\n",
|
||
" 'hochregal',\n",
|
||
" 'hochtemperatur',\n",
|
||
" 'hochtemperatureinsatz',\n",
|
||
" 'hydraulik',\n",
|
||
" 'hydrauliköl',\n",
|
||
" 'impulseingang',\n",
|
||
" 'indikator',\n",
|
||
" 'inneneinheit',\n",
|
||
" 'insektenvernichter',\n",
|
||
" 'kabel',\n",
|
||
" 'kammer',\n",
|
||
" 'karton',\n",
|
||
" 'kegelradgetriebe',\n",
|
||
" 'kegelradgetriebemotor',\n",
|
||
" 'kette',\n",
|
||
" 'klemmrolle',\n",
|
||
" 'klimaanlage',\n",
|
||
" 'klimabühne',\n",
|
||
" 'klimagerät',\n",
|
||
" 'kompressor',\n",
|
||
" 'kompressorluftwert',\n",
|
||
" 'kontoll',\n",
|
||
" 'kontrawalze',\n",
|
||
" 'kontroll',\n",
|
||
" 'krankheit',\n",
|
||
" 'krän',\n",
|
||
" 'kräne',\n",
|
||
" 'kuehlaggregat',\n",
|
||
" 'kw',\n",
|
||
" 'kühlgerät',\n",
|
||
" 'lagereinheit',\n",
|
||
" 'lagereinsatz',\n",
|
||
" 'lagerort',\n",
|
||
" 'lagerung',\n",
|
||
" 'laser',\n",
|
||
" 'laufgeräusche',\n",
|
||
" 'luftansaugseite',\n",
|
||
" 'luftfilter',\n",
|
||
" 'luftfilterwasserabscheider',\n",
|
||
" 'luftmenge',\n",
|
||
" 'luftreiniger',\n",
|
||
" 'lösungsmittel',\n",
|
||
" 'lüftungsanlage',\n",
|
||
" 'macke',\n",
|
||
" 'managementsystem',\n",
|
||
" 'maschinenanschluss',\n",
|
||
" 'materialzersetzung',\n",
|
||
" 'messlager',\n",
|
||
" 'micron',\n",
|
||
" 'mischer',\n",
|
||
" 'monatlicher',\n",
|
||
" 'monatliches',\n",
|
||
" 'monteur',\n",
|
||
" 'moos',\n",
|
||
" 'motorstart',\n",
|
||
" 'nachfetten',\n",
|
||
" 'nachschmieren',\n",
|
||
" 'nachspann',\n",
|
||
" 'neuvertrag',\n",
|
||
" 'nord',\n",
|
||
" 'nottelefon',\n",
|
||
" 'nr',\n",
|
||
" 'oberer',\n",
|
||
" 'oberflächenkontrolle',\n",
|
||
" 'objektkarte',\n",
|
||
" 'palette',\n",
|
||
" 'pendelkugellager',\n",
|
||
" 'pfeifer',\n",
|
||
" 'platine',\n",
|
||
" 'pneum',\n",
|
||
" 'pneumatikventil',\n",
|
||
" 'pneumatisch',\n",
|
||
" 'pos',\n",
|
||
" 'positioniersystem',\n",
|
||
" 'prozesskennzahl',\n",
|
||
" 'prüfbericht',\n",
|
||
" 'prüfplan',\n",
|
||
" 'rampenbereich',\n",
|
||
" 'rauwalze',\n",
|
||
" 'regalprüfer',\n",
|
||
" 'regalsicherungsanlage',\n",
|
||
" 'reiniger',\n",
|
||
" 'reinigungstuch',\n",
|
||
" 'restlich',\n",
|
||
" 'risikoersatzteil',\n",
|
||
" 'rohrtrenner',\n",
|
||
" 'roller',\n",
|
||
" 'rundgangkontrollen',\n",
|
||
" 'rückmeldung',\n",
|
||
" 'sae',\n",
|
||
" 'sauberkeit',\n",
|
||
" 'schlitten',\n",
|
||
" 'schmierstoff',\n",
|
||
" 'schmierstoffmenge',\n",
|
||
" 'schneider',\n",
|
||
" 'schraube',\n",
|
||
" 'schraubenbestand',\n",
|
||
" 'schutzabdeckung',\n",
|
||
" 'sicherheitsbeleuchtung',\n",
|
||
" 'sicherheitseinrichtung',\n",
|
||
" 'sicherheitslichtschranke',\n",
|
||
" 'sicherheitsweste',\n",
|
||
" 'sicherstellung',\n",
|
||
" 'sonotrode',\n",
|
||
" 'sonotrodenständer',\n",
|
||
" 'spannkopflager',\n",
|
||
" 'spannlager',\n",
|
||
" 'spannrahmen',\n",
|
||
" 'spindel',\n",
|
||
" 'spindelhubgetriebe',\n",
|
||
" 'spindelmutter',\n",
|
||
" 'spülzeitprüfung',\n",
|
||
" 'stab',\n",
|
||
" 'stadtwasser',\n",
|
||
" 'stehlager',\n",
|
||
" 'stehlagergehäuse',\n",
|
||
" 'steuerung',\n",
|
||
" 'stückliste',\n",
|
||
" 'systemumstellung',\n",
|
||
" 'telefonanlage',\n",
|
||
" 'telefonat',\n",
|
||
" 'termin',\n",
|
||
" 'terminabsprache',\n",
|
||
" 'terminiern',\n",
|
||
" 'terminiert',\n",
|
||
" 'terminierung',\n",
|
||
" 'terminvorschlag',\n",
|
||
" 'testomat',\n",
|
||
" 'thermoheizelement',\n",
|
||
" 'torsprechanlage',\n",
|
||
" 'trinkwassernetz',\n",
|
||
" 'trockenzylinder',\n",
|
||
" 'tänzerrolle',\n",
|
||
" 'türdichtung',\n",
|
||
" 'türgriff',\n",
|
||
" 'türsicherung',\n",
|
||
" 'umlenkwalzen',\n",
|
||
" 'umrandung',\n",
|
||
" 'unkraut',\n",
|
||
" 'uschienenführung',\n",
|
||
" 'uvv',\n",
|
||
" 'ventil',\n",
|
||
" 'verbaut',\n",
|
||
" 'verbrennungsset',\n",
|
||
" 'vereinbarung',\n",
|
||
" 'verkalkung',\n",
|
||
" 'verschleiteileinsatz',\n",
|
||
" 'verschmutzung',\n",
|
||
" 'verschmutzungenlos',\n",
|
||
" 'verstellung',\n",
|
||
" 'verunreinigung',\n",
|
||
" 'vollständigkeit',\n",
|
||
" 'volumenzähler',\n",
|
||
" 'vorderer',\n",
|
||
" 'vordruck',\n",
|
||
" 'vorfilter',\n",
|
||
" 'vorfilterflie',\n",
|
||
" 'vorliegen',\n",
|
||
" 'vormonat',\n",
|
||
" 'wartungsintervall',\n",
|
||
" 'wartungsvertrag',\n",
|
||
" 'wasserfilter',\n",
|
||
" 'wasserhärte',\n",
|
||
" 'wasserpegelkontrolle',\n",
|
||
" 'wasserzählerstand',\n",
|
||
" 'wechselintervall',\n",
|
||
" 'wärmetauscher',\n",
|
||
" 'zahnrieme',\n",
|
||
" 'zahnstange',\n",
|
||
" 'zuleitung',\n",
|
||
" 'zuschicken',\n",
|
||
" 'ölfüllung',\n",
|
||
" 'ölstand',\n",
|
||
" 'ölstandsichtprüfung',\n",
|
||
" 'ölstandskontrolle',\n",
|
||
" 'überziehen'\n",
|
||
"}"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 45,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"corrections: dict[str, str] = {\n",
|
||
" 'desifektionsmittel': 'desinfektionsmittel',\n",
|
||
" 'schweikopf': 'schweisskopf',\n",
|
||
"}"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 46,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"INFO:base:Number of entries processed: 1, Percent completed: 0.01\n"
|
||
]
|
||
},
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"INFO:base:Number of entries processed: 501, Percent completed: 7.42\n",
|
||
"INFO:base:Number of entries processed: 1001, Percent completed: 14.82\n",
|
||
"INFO:base:Number of entries processed: 1501, Percent completed: 22.23\n",
|
||
"INFO:base:Number of entries processed: 2001, Percent completed: 29.63\n",
|
||
"INFO:base:Number of entries processed: 2501, Percent completed: 37.04\n",
|
||
"INFO:base:Number of entries processed: 3001, Percent completed: 44.44\n",
|
||
"INFO:base:Number of entries processed: 3501, Percent completed: 51.84\n",
|
||
"INFO:base:Number of entries processed: 4001, Percent completed: 59.25\n",
|
||
"INFO:base:Number of entries processed: 4501, Percent completed: 66.65\n",
|
||
"INFO:base:Number of entries processed: 5001, Percent completed: 74.06\n",
|
||
"INFO:base:Number of entries processed: 5501, Percent completed: 81.46\n",
|
||
"INFO:base:Number of entries processed: 6001, Percent completed: 88.86\n",
|
||
"INFO:base:Number of entries processed: 6501, Percent completed: 96.27\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
"# collect weighted token connections as the basis for the adjacency matrix\n",
"connections = dict()\n",
"unique_tokens = set()\n",
"UPDATE_STATUS = 500  # log progress every 500 entries\n",
"length_data = len(descr)\n",
"spell_check_candidates = set()\n",
"spell_checker = SpellChecker(language='de', distance=1)\n",
"\n",
"# recompute unless cached results are to be loaded; test runs always recompute\n",
"if not LOAD_CALC_FILES or IS_TEST:\n",
"    for count, description in enumerate(descr.iterrows()):\n",
"        \n",
"        # iterrows() yields (index, row) tuples; the row holds the columns\n",
"        text = description[1]['descr']\n",
"        weight = description[1]['num_occur']\n",
"        \n",
"        doc = nlp(text)\n",
"        \n",
"        obtain_descendant_info(\n",
"            doc=doc,\n",
"            weight=weight,\n",
"            POS_of_interest=POS_of_interest,\n",
"            TAG_of_interest=TAG_of_interest,\n",
"            connections=connections,\n",
"            unique_tokens=unique_tokens,\n",
"            spell_check_candidates=spell_check_candidates,\n",
"            spell_check_whitelist=spell_check_whitelist,\n",
"            spell_checker=spell_checker,\n",
"            corrections=corrections,\n",
"        )\n",
"        \n",
"        if count % UPDATE_STATUS == 0:\n",
"            logger.info(f'Number of entries processed: {count+1}, Percent completed: {((count+1) / length_data) * 100:.2f}')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 50,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
"ADJ_DF_PATH = './Graphanalyse/adj_mat_df.fth'\n",
"if not IS_TEST:\n",
"    if LOAD_CALC_FILES:\n",
"        adj_mat_undir = pd.read_feather(ADJ_DF_PATH)\n",
"        adj_mat_undir = adj_mat_undir.set_index('index')\n",
"        # additional information\n",
"        connections = load_pickle('connections.pkl')\n",
"        unique_tokens = load_pickle('unique_tokens.pkl')\n",
"    else:\n",
"        adj_mat = obtain_adj_matrix(unique_tokens=unique_tokens, connections=connections)\n",
"        adj_mat_undir = make_undir_adj_matrix(adj_mat=adj_mat)\n",
"        save_df = adj_mat_undir.reset_index()\n",
"        save_df.to_feather(ADJ_DF_PATH)\n",
"        # additional information\n",
"        save_pickle(obj=connections, path='connections.pkl')\n",
"        save_pickle(obj=unique_tokens, path='unique_tokens.pkl')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 51,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>funktionsfähig</th>\n",
|
||
" <th>Rechter</th>\n",
|
||
" <th>Laserteller</th>\n",
|
||
" <th>vorbereiten</th>\n",
|
||
" <th>weiterer</th>\n",
|
||
" <th>Ausbau</th>\n",
|
||
" <th>Travers</th>\n",
|
||
" <th>Funktionsbereitschaft</th>\n",
|
||
" <th>umwandeln</th>\n",
|
||
" <th>Hechtanlage</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Filterpumpe</th>\n",
|
||
" <th>entwickeln</th>\n",
|
||
" <th>Pumpenstab</th>\n",
|
||
" <th>Hauptrade</th>\n",
|
||
" <th>anlernen</th>\n",
|
||
" <th>Begutachtung</th>\n",
|
||
" <th>Betriebszeit</th>\n",
|
||
" <th>Wassereinbruch</th>\n",
|
||
" <th>Antriebszahnrad</th>\n",
|
||
" <th>Prostataproblem</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>-Austausch</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>-Befestihgung</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>-Bereich</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>-Betonblock-</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>-Bremskombination</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>überziechen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>überziehen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>überziehn</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>üblich</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>üperprüfen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>7046 rows × 7046 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" funktionsfähig Rechter Laserteller vorbereiten \\\n",
|
||
"-Austausch 0 0 0 0 \n",
|
||
"-Befestihgung 0 0 0 0 \n",
|
||
"-Bereich 0 0 0 0 \n",
|
||
"-Betonblock- 0 0 0 0 \n",
|
||
"-Bremskombination 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"überziechen 0 0 0 0 \n",
|
||
"überziehen 0 0 0 0 \n",
|
||
"überziehn 0 0 0 0 \n",
|
||
"üblich 0 0 0 0 \n",
|
||
"üperprüfen 0 0 0 0 \n",
|
||
"\n",
|
||
" weiterer Ausbau Travers Funktionsbereitschaft \\\n",
|
||
"-Austausch 0 0 0 0 \n",
|
||
"-Befestihgung 0 0 0 0 \n",
|
||
"-Bereich 0 0 0 0 \n",
|
||
"-Betonblock- 0 0 0 0 \n",
|
||
"-Bremskombination 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"überziechen 0 0 0 0 \n",
|
||
"überziehen 0 0 0 0 \n",
|
||
"überziehn 0 0 0 0 \n",
|
||
"üblich 0 0 0 0 \n",
|
||
"üperprüfen 0 0 0 0 \n",
|
||
"\n",
|
||
" umwandeln Hechtanlage ... Filterpumpe entwickeln \\\n",
|
||
"-Austausch 0 0 ... 0 0 \n",
|
||
"-Befestihgung 0 0 ... 0 0 \n",
|
||
"-Bereich 0 0 ... 0 0 \n",
|
||
"-Betonblock- 0 0 ... 0 0 \n",
|
||
"-Bremskombination 0 0 ... 0 0 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"überziechen 0 0 ... 0 0 \n",
|
||
"überziehen 0 0 ... 0 0 \n",
|
||
"überziehn 0 0 ... 0 0 \n",
|
||
"üblich 0 0 ... 0 0 \n",
|
||
"üperprüfen 0 0 ... 0 0 \n",
|
||
"\n",
|
||
" Pumpenstab Hauptrade anlernen Begutachtung \\\n",
|
||
"-Austausch 0 0 0 0 \n",
|
||
"-Befestihgung 0 0 0 0 \n",
|
||
"-Bereich 0 0 0 0 \n",
|
||
"-Betonblock- 0 0 0 0 \n",
|
||
"-Bremskombination 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"überziechen 0 0 0 0 \n",
|
||
"überziehen 0 0 0 0 \n",
|
||
"überziehn 0 0 0 0 \n",
|
||
"üblich 0 0 0 0 \n",
|
||
"üperprüfen 0 0 0 0 \n",
|
||
"\n",
|
||
" Betriebszeit Wassereinbruch Antriebszahnrad \\\n",
|
||
"-Austausch 0 0 0 \n",
|
||
"-Befestihgung 0 0 0 \n",
|
||
"-Bereich 0 0 0 \n",
|
||
"-Betonblock- 0 0 0 \n",
|
||
"-Bremskombination 0 0 0 \n",
|
||
"... ... ... ... \n",
|
||
"überziechen 0 0 0 \n",
|
||
"überziehen 0 0 0 \n",
|
||
"überziehn 0 0 0 \n",
|
||
"üblich 0 0 0 \n",
|
||
"üperprüfen 0 0 0 \n",
|
||
"\n",
|
||
" Prostataproblem \n",
|
||
"-Austausch 0 \n",
|
||
"-Befestihgung 0 \n",
|
||
"-Bereich 0 \n",
|
||
"-Betonblock- 0 \n",
|
||
"-Bremskombination 0 \n",
|
||
"... ... \n",
|
||
"überziechen 0 \n",
|
||
"überziehen 0 \n",
|
||
"überziehn 0 \n",
|
||
"üblich 0 \n",
|
||
"üperprüfen 0 \n",
|
||
"\n",
|
||
"[7046 rows x 7046 columns]"
|
||
]
|
||
},
|
||
"execution_count": 51,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"adj_mat_undir.sort_index()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>koennen</th>\n",
|
||
" <th>Weiterleitung</th>\n",
|
||
" <th>Brand</th>\n",
|
||
" <th>ein</th>\n",
|
||
" <th>Geräteinneres</th>\n",
|
||
" <th>Schmerz</th>\n",
|
||
" <th>Monat</th>\n",
|
||
" <th>Kontrawalzenbelag</th>\n",
|
||
" <th>Funktionstest</th>\n",
|
||
" <th>Kesselwasser</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Niveau</th>\n",
|
||
" <th>Sprinkleranlage</th>\n",
|
||
" <th>Abdeckglas</th>\n",
|
||
" <th>Stoptast</th>\n",
|
||
" <th>ORing</th>\n",
|
||
" <th>ausblasen</th>\n",
|
||
" <th>absprechen</th>\n",
|
||
" <th>Artikelnummer</th>\n",
|
||
" <th>Fehlersichtung</th>\n",
|
||
" <th>brannen</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>koennen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Weiterleitung</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Brand</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>ein</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Geräteinneres</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>ausblasen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>absprechen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Artikelnummer</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Fehlersichtung</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>brannen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6959 rows × 6959 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" koennen Weiterleitung Brand ein Geräteinneres Schmerz \\\n",
|
||
"koennen 0 0 0 0 0 0 \n",
|
||
"Weiterleitung 0 0 0 0 0 0 \n",
|
||
"Brand 0 0 0 0 0 0 \n",
|
||
"ein 0 0 0 0 0 0 \n",
|
||
"Geräteinneres 0 0 0 0 0 0 \n",
|
||
"... ... ... ... ... ... ... \n",
|
||
"ausblasen 0 0 0 0 0 0 \n",
|
||
"absprechen 0 0 0 0 0 0 \n",
|
||
"Artikelnummer 0 0 0 0 0 0 \n",
|
||
"Fehlersichtung 0 0 0 0 0 0 \n",
|
||
"brannen 0 0 0 0 0 0 \n",
|
||
"\n",
|
||
" Monat Kontrawalzenbelag Funktionstest Kesselwasser ... \\\n",
|
||
"koennen 0 0 0 0 ... \n",
|
||
"Weiterleitung 0 0 0 0 ... \n",
|
||
"Brand 0 0 0 0 ... \n",
|
||
"ein 0 0 0 0 ... \n",
|
||
"Geräteinneres 0 0 0 0 ... \n",
|
||
"... ... ... ... ... ... \n",
|
||
"ausblasen 0 0 0 0 ... \n",
|
||
"absprechen 0 0 0 0 ... \n",
|
||
"Artikelnummer 0 0 0 0 ... \n",
|
||
"Fehlersichtung 0 0 0 0 ... \n",
|
||
"brannen 0 0 0 0 ... \n",
|
||
"\n",
|
||
" Niveau Sprinkleranlage Abdeckglas Stoptast ORing \\\n",
|
||
"koennen 0 0 0 0 0 \n",
|
||
"Weiterleitung 0 0 0 0 0 \n",
|
||
"Brand 0 0 0 0 0 \n",
|
||
"ein 0 0 0 0 0 \n",
|
||
"Geräteinneres 0 0 0 0 0 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"ausblasen 0 0 0 0 0 \n",
|
||
"absprechen 0 0 0 0 0 \n",
|
||
"Artikelnummer 0 0 0 0 0 \n",
|
||
"Fehlersichtung 0 0 0 0 0 \n",
|
||
"brannen 0 0 0 0 0 \n",
|
||
"\n",
|
||
" ausblasen absprechen Artikelnummer Fehlersichtung brannen \n",
|
||
"koennen 0 0 0 0 0 \n",
|
||
"Weiterleitung 0 0 0 0 0 \n",
|
||
"Brand 0 0 0 0 0 \n",
|
||
"ein 0 0 0 0 0 \n",
|
||
"Geräteinneres 0 0 0 0 0 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"ausblasen 0 0 0 0 0 \n",
|
||
"absprechen 0 0 0 0 0 \n",
|
||
"Artikelnummer 0 0 0 0 0 \n",
|
||
"Fehlersichtung 0 0 0 0 0 \n",
|
||
"brannen 0 0 0 0 0 \n",
|
||
"\n",
|
||
"[6959 rows x 6959 columns]"
|
||
]
|
||
},
|
||
"execution_count": 776,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"adj_mat_undir"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"arr = adj_mat_undir.to_numpy()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"24490"
|
||
]
|
||
},
|
||
"execution_count": 54,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.count_nonzero(arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"92882"
|
||
]
|
||
},
|
||
"execution_count": 55,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.max(arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"195"
|
||
]
|
||
},
|
||
"execution_count": 56,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"uni_arr = np.unique(arr)\n",
|
||
"len(uni_arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
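||
"Threshold: edge weights below `WEIGHT_THRESHOLD` are set to zero, which prunes weak connections from the adjacency matrix."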
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"WEIGHT_THRESHOLD = 50"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"arr = adj_mat_undir.to_numpy()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"arr = np.where(arr < WEIGHT_THRESHOLD, 0, arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"387"
|
||
]
|
||
},
|
||
"execution_count": 788,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.count_nonzero(arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"177"
|
||
]
|
||
},
|
||
"execution_count": 789,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp = np.sum(arr, axis=0)\n",
|
||
"np.count_nonzero(temp)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"thresh_adj_mat = adj_mat_undir.copy()\n",
|
||
"thresh_adj_mat.loc[:] = arr"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>koennen</th>\n",
|
||
" <th>Weiterleitung</th>\n",
|
||
" <th>Brand</th>\n",
|
||
" <th>ein</th>\n",
|
||
" <th>Geräteinneres</th>\n",
|
||
" <th>Schmerz</th>\n",
|
||
" <th>Monat</th>\n",
|
||
" <th>Kontrawalzenbelag</th>\n",
|
||
" <th>Funktionstest</th>\n",
|
||
" <th>Kesselwasser</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Niveau</th>\n",
|
||
" <th>Sprinkleranlage</th>\n",
|
||
" <th>Abdeckglas</th>\n",
|
||
" <th>Stoptast</th>\n",
|
||
" <th>ORing</th>\n",
|
||
" <th>ausblasen</th>\n",
|
||
" <th>absprechen</th>\n",
|
||
" <th>Artikelnummer</th>\n",
|
||
" <th>Fehlersichtung</th>\n",
|
||
" <th>brannen</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>koennen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Weiterleitung</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Brand</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>ein</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Geräteinneres</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>ausblasen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>absprechen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Artikelnummer</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Fehlersichtung</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>brannen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6959 rows × 6959 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" koennen Weiterleitung Brand ein Geräteinneres Schmerz \\\n",
|
||
"koennen 0 0 0 0 0 0 \n",
|
||
"Weiterleitung 0 0 0 0 0 0 \n",
|
||
"Brand 0 0 0 0 0 0 \n",
|
||
"ein 0 0 0 0 0 0 \n",
|
||
"Geräteinneres 0 0 0 0 0 0 \n",
|
||
"... ... ... ... ... ... ... \n",
|
||
"ausblasen 0 0 0 0 0 0 \n",
|
||
"absprechen 0 0 0 0 0 0 \n",
|
||
"Artikelnummer 0 0 0 0 0 0 \n",
|
||
"Fehlersichtung 0 0 0 0 0 0 \n",
|
||
"brannen 0 0 0 0 0 0 \n",
|
||
"\n",
|
||
" Monat Kontrawalzenbelag Funktionstest Kesselwasser ... \\\n",
|
||
"koennen 0 0 0 0 ... \n",
|
||
"Weiterleitung 0 0 0 0 ... \n",
|
||
"Brand 0 0 0 0 ... \n",
|
||
"ein 0 0 0 0 ... \n",
|
||
"Geräteinneres 0 0 0 0 ... \n",
|
||
"... ... ... ... ... ... \n",
|
||
"ausblasen 0 0 0 0 ... \n",
|
||
"absprechen 0 0 0 0 ... \n",
|
||
"Artikelnummer 0 0 0 0 ... \n",
|
||
"Fehlersichtung 0 0 0 0 ... \n",
|
||
"brannen 0 0 0 0 ... \n",
|
||
"\n",
|
||
" Niveau Sprinkleranlage Abdeckglas Stoptast ORing \\\n",
|
||
"koennen 0 0 0 0 0 \n",
|
||
"Weiterleitung 0 0 0 0 0 \n",
|
||
"Brand 0 0 0 0 0 \n",
|
||
"ein 0 0 0 0 0 \n",
|
||
"Geräteinneres 0 0 0 0 0 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"ausblasen 0 0 0 0 0 \n",
|
||
"absprechen 0 0 0 0 0 \n",
|
||
"Artikelnummer 0 0 0 0 0 \n",
|
||
"Fehlersichtung 0 0 0 0 0 \n",
|
||
"brannen 0 0 0 0 0 \n",
|
||
"\n",
|
||
" ausblasen absprechen Artikelnummer Fehlersichtung brannen \n",
|
||
"koennen 0 0 0 0 0 \n",
|
||
"Weiterleitung 0 0 0 0 0 \n",
|
||
"Brand 0 0 0 0 0 \n",
|
||
"ein 0 0 0 0 0 \n",
|
||
"Geräteinneres 0 0 0 0 0 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"ausblasen 0 0 0 0 0 \n",
|
||
"absprechen 0 0 0 0 0 \n",
|
||
"Artikelnummer 0 0 0 0 0 \n",
|
||
"Fehlersichtung 0 0 0 0 0 \n",
|
||
"brannen 0 0 0 0 0 \n",
|
||
"\n",
|
||
"[6959 rows x 6959 columns]"
|
||
]
|
||
},
|
||
"execution_count": 791,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"thresh_adj_mat"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"ADJ_MAT_PATH_CSV = f'./Graphanalyse/adj_mat_thresh_{WEIGHT_THRESHOLD}.csv'\n",
|
||
"thresh_adj_mat.to_csv(path_or_buf=ADJ_MAT_PATH_CSV, encoding='cp1252', sep=';')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"***Testing***"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"important_words = []\n",
|
||
"all_entities = []\n",
|
||
"pos_tags = set()\n",
|
||
"pos_counter = dict()\n",
|
||
"token_counter = 0\n",
|
||
"\n",
|
||
"for description in descr:\n",
|
||
" doc = nlp(description)\n",
|
||
" \n",
|
||
" relevant_words = []\n",
|
||
" for token in doc:\n",
|
||
" POS = token.pos_\n",
|
||
" token_counter += 1\n",
|
||
"        pos_counter[POS] = pos_counter.get(POS, 0) + 1\n",
|
||
" \n",
|
||
"        if not (token.is_stop or token.is_punct or token.is_space) and POS in {'NOUN', 'PROPN', 'ADJ', 'ADV'}:\n",
|
||
" relevant_words.append((token.lemma_.lower(), POS))\n",
|
||
" #pos_tags.add(token.pos_)\n",
|
||
" \n",
|
||
"    entities = [(ent.text, ent.label_) for ent in doc.ents]\n",
|
||
" \n",
|
||
" important_words.extend(relevant_words)\n",
|
||
" all_entities.extend(entities)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"[('täglich', 'ADJ'),\n",
|
||
" ('wartungstätigkeit', 'NOUN'),\n",
|
||
" ('vorgabe', 'NOUN'),\n",
|
||
" ('maschinenhersteller', 'NOUN'),\n",
|
||
" ('wöchentliche', 'ADJ'),\n",
|
||
" ('sichtkontrolle', 'NOUN'),\n",
|
||
" ('reinigung', 'NOUN'),\n",
|
||
" ('täglich', 'ADJ'),\n",
|
||
" ('überprüfung', 'NOUN'),\n",
|
||
" ('ölabscheider', 'NOUN'),\n",
|
||
" ('wöchentlich', 'ADJ'),\n",
|
||
" ('kontrolle', 'NOUN'),\n",
|
||
" ('wc-anlage', 'NOUN'),\n",
|
||
" ('halbjährliche', 'ADJ'),\n",
|
||
" ('kontrolle', 'NOUN'),\n",
|
||
" ('stabbreithalter', 'NOUN'),\n",
|
||
" ('brandschutztechnische', 'ADJ'),\n",
|
||
" ('prüfung', 'NOUN'),\n",
|
||
" ('prüfung', 'NOUN'),\n",
|
||
" ('scharniere', 'NOUN'),\n",
|
||
" ('dichtung', 'NOUN'),\n",
|
||
" ('schließvorrichtung', 'NOUN'),\n",
|
||
" ('schloß', 'NOUN'),\n",
|
||
" ('beschlag', 'NOUN'),\n",
|
||
" ('allgemein', 'ADJ'),\n",
|
||
" ('funktion', 'NOUN'),\n",
|
||
" ('schmierung', 'NOUN'),\n",
|
||
" ('festhaltevorrichtung', 'NOUN'),\n",
|
||
" ('täglich', 'ADJ'),\n",
|
||
" ('technikrundgang', 'NOUN'),\n",
|
||
" ('monatliche', 'ADJ'),\n",
|
||
" ('sichtkontrolle', 'NOUN'),\n",
|
||
" ('monatliche', 'ADJ'),\n",
|
||
" ('prüfung', 'NOUN'),\n",
|
||
" ('scharniere', 'NOUN'),\n",
|
||
" ('dichtung', 'NOUN'),\n",
|
||
" ('schließvorrichtung', 'NOUN'),\n",
|
||
" ('schloß', 'NOUN'),\n",
|
||
" ('beschlag', 'NOUN'),\n",
|
||
" ('allgemein', 'ADJ'),\n",
|
||
" ('funktion', 'NOUN'),\n",
|
||
" ('schmierung', 'NOUN'),\n",
|
||
" ('festhaltevorrichtung', 'NOUN')]"
|
||
]
|
||
},
|
||
"execution_count": 221,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"important_words"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"43"
|
||
]
|
||
},
|
||
"execution_count": 222,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"len(important_words)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"[]"
|
||
]
|
||
},
|
||
"execution_count": 223,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_entities"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"count = Counter(important_words)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Counter({('täglich', 'ADJ'): 3,\n",
|
||
" ('prüfung', 'NOUN'): 3,\n",
|
||
" ('sichtkontrolle', 'NOUN'): 2,\n",
|
||
" ('kontrolle', 'NOUN'): 2,\n",
|
||
" ('scharniere', 'NOUN'): 2,\n",
|
||
" ('dichtung', 'NOUN'): 2,\n",
|
||
" ('schließvorrichtung', 'NOUN'): 2,\n",
|
||
" ('schloß', 'NOUN'): 2,\n",
|
||
" ('beschlag', 'NOUN'): 2,\n",
|
||
" ('allgemein', 'ADJ'): 2,\n",
|
||
" ('funktion', 'NOUN'): 2,\n",
|
||
" ('schmierung', 'NOUN'): 2,\n",
|
||
" ('festhaltevorrichtung', 'NOUN'): 2,\n",
|
||
" ('monatliche', 'ADJ'): 2,\n",
|
||
" ('wartungstätigkeit', 'NOUN'): 1,\n",
|
||
" ('vorgabe', 'NOUN'): 1,\n",
|
||
" ('maschinenhersteller', 'NOUN'): 1,\n",
|
||
" ('wöchentliche', 'ADJ'): 1,\n",
|
||
" ('reinigung', 'NOUN'): 1,\n",
|
||
" ('überprüfung', 'NOUN'): 1,\n",
|
||
" ('ölabscheider', 'NOUN'): 1,\n",
|
||
" ('wöchentlich', 'ADJ'): 1,\n",
|
||
" ('wc-anlage', 'NOUN'): 1,\n",
|
||
" ('halbjährliche', 'ADJ'): 1,\n",
|
||
" ('stabbreithalter', 'NOUN'): 1,\n",
|
||
" ('brandschutztechnische', 'ADJ'): 1,\n",
|
||
" ('technikrundgang', 'NOUN'): 1})"
|
||
]
|
||
},
|
||
"execution_count": 225,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"count"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"NOUN 25722\n",
|
||
"PUNCT 11626\n",
|
||
"VERB 9093\n",
|
||
"ADP 7211\n",
|
||
"ADV 6526\n",
|
||
"PROPN 4481\n",
|
||
"NUM 4115\n",
|
||
"DET 3845\n",
|
||
"ADJ 2576\n",
|
||
"AUX 2329\n",
|
||
"PART 1561\n",
|
||
"CCONJ 1305\n",
|
||
"X 999\n",
|
||
"PRON 916\n",
|
||
"SCONJ 385\n",
|
||
"SPACE 236\n",
|
||
"INTJ 1\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 180,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pos_count = pd.Series(data=pos_counter)\n",
|
||
"pos_count.sort_values(ascending=False)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"NOUN 0.310176\n",
|
||
"PUNCT 0.140196\n",
|
||
"VERB 0.109651\n",
|
||
"ADP 0.086956\n",
|
||
"ADV 0.078696\n",
|
||
"PROPN 0.054035\n",
|
||
"NUM 0.049622\n",
|
||
"DET 0.046366\n",
|
||
"ADJ 0.031063\n",
|
||
"AUX 0.028085\n",
|
||
"PART 0.018824\n",
|
||
"CCONJ 0.015737\n",
|
||
"X 0.012047\n",
|
||
"PRON 0.011046\n",
|
||
"SCONJ 0.004643\n",
|
||
"SPACE 0.002846\n",
|
||
"INTJ 0.000012\n",
|
||
"dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 184,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pos_count_rel = pos_count / pos_count.sum()\n",
|
||
"pos_count_rel.sort_values(ascending=False)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"82927"
|
||
]
|
||
},
|
||
"execution_count": 181,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"token_counter"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"### Weiterführende Analyse der Beschreibungen\n",
|
||
"\n",
|
||
"- unklare Zusammenhänge der 1200er-Threshold-Ergebnisse präzisieren:\n",
|
||
" - Finden der entsprechenden Beschreibungen\n",
|
||
" - Kontextualisieren\n",
|
||
"- Identifikation von weiteren Blacklistworten"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Unklare Zusammenhänge 1200er-Threshold"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>descr</th>\n",
|
||
" <th>len</th>\n",
|
||
" <th>num_occur</th>\n",
|
||
" <th>assoc_obj_ids</th>\n",
|
||
" <th>num_assoc_obj_ids</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>index</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>161</th>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>66</td>\n",
|
||
" <td>92592</td>\n",
|
||
" <td>[0, 17, 41, 42, 43, 44, 45, 46, 47, 51, 52, 53...</td>\n",
|
||
" <td>206</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>33</th>\n",
|
||
" <td>Wöchentliche Sichtkontrolle Reinigung</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>1654</td>\n",
|
||
" <td>[301, 304, 305, 313, 314, 331, 332, 510, 511, ...</td>\n",
|
||
" <td>18</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>130</th>\n",
|
||
" <td>Tägliche Überprüfung der Ölabscheider</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>1616</td>\n",
|
||
" <td>[0, 970, 2134, 2137]</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>159</th>\n",
|
||
" <td>Wöchentliche Kontrolle der WC-Anlagen</td>\n",
|
||
" <td>37</td>\n",
|
||
" <td>1265</td>\n",
|
||
" <td>[1352, 1353, 1354, 1684, 1685, 1686, 1687, 168...</td>\n",
|
||
" <td>11</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>139</th>\n",
|
||
" <td>Halbjährliche Kontrolle des Stabbreithalters</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>687</td>\n",
|
||
" <td>[51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 6...</td>\n",
|
||
" <td>166</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2675</th>\n",
|
||
" <td>Stand 15.07.2020 Stöppel: Herr Langner Toyota ...</td>\n",
|
||
" <td>253</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[311]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2674</th>\n",
|
||
" <td>Zahnräder der Laufkatze verschlissen Ersatztei...</td>\n",
|
||
" <td>167</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[415]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2673</th>\n",
|
||
" <td>Bitte 8 Scheiben nach Muster anfertigen. Danke.</td>\n",
|
||
" <td>47</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[140]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2672</th>\n",
|
||
" <td>Schalter für Bühne Schwenken abgerissen, bitte...</td>\n",
|
||
" <td>123</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[323]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6781</th>\n",
|
||
" <td>Befestigung Deckel für Batteriefach defekt Hal...</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[326]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6782 rows × 5 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" descr len num_occur \\\n",
|
||
"index \n",
|
||
"161 Tägliche Wartungstätigkeiten nach Vorgabe des ... 66 92592 \n",
|
||
"33 Wöchentliche Sichtkontrolle Reinigung 37 1654 \n",
|
||
"130 Tägliche Überprüfung der Ölabscheider 37 1616 \n",
|
||
"159 Wöchentliche Kontrolle der WC-Anlagen 37 1265 \n",
|
||
"139 Halbjährliche Kontrolle des Stabbreithalters 44 687 \n",
|
||
"... ... ... ... \n",
|
||
"2675 Stand 15.07.2020 Stöppel: Herr Langner Toyota ... 253 1 \n",
|
||
"2674 Zahnräder der Laufkatze verschlissen Ersatztei... 167 1 \n",
|
||
"2673 Bitte 8 Scheiben nach Muster anfertigen. Danke. 47 1 \n",
|
||
"2672 Schalter für Bühne Schwenken abgerissen, bitte... 123 1 \n",
|
||
"6781 Befestigung Deckel für Batteriefach defekt Hal... 99 1 \n",
|
||
"\n",
|
||
" assoc_obj_ids num_assoc_obj_ids \n",
|
||
"index \n",
|
||
"161 [0, 17, 41, 42, 43, 44, 45, 46, 47, 51, 52, 53... 206 \n",
|
||
"33 [301, 304, 305, 313, 314, 331, 332, 510, 511, ... 18 \n",
|
||
"130 [0, 970, 2134, 2137] 4 \n",
|
||
"159 [1352, 1353, 1354, 1684, 1685, 1686, 1687, 168... 11 \n",
|
||
"139 [51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 6... 166 \n",
|
||
"... ... ... \n",
|
||
"2675 [311] 1 \n",
|
||
"2674 [415] 1 \n",
|
||
"2673 [140] 1 \n",
|
||
"2672 [323] 1 \n",
|
||
"6781 [326] 1 \n",
|
||
"\n",
|
||
"[6782 rows x 5 columns]"
|
||
]
|
||
},
|
||
"execution_count": 54,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"temp2 = temp1.loc[temp1['num_occur'] >= 3, :]\n",
|
||
"temp2 = temp1.copy()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"#temp2 = temp2.iloc[:30,:]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"check_words = set(['E1.8'])\n",
|
||
"target_indices = list()\n",
|
||
"\n",
|
||
"for idx, row in temp2.iterrows():\n",
|
||
" \n",
|
||
" text = row['descr']\n",
|
||
" doc = nlp(text)\n",
|
||
" \n",
|
||
" token_set = set()\n",
|
||
" target_idx = None\n",
|
||
" for token in doc:\n",
|
||
" \n",
|
||
" if not (token.pos_ in POS_of_interest or token.tag_ in TAG_of_interest):\n",
|
||
" continue\n",
|
||
" \n",
|
||
" token_set.add(token.lemma_.lower())\n",
|
||
" #print(f'{token_set=}')\n",
|
||
"\n",
|
||
" if token_set.issuperset(check_words):\n",
|
||
" target_indices.append(idx)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"[]"
|
||
]
|
||
},
|
||
"execution_count": 61,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"target_indices"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'Vorgaben aus Pleva Wartungsplan Schmieren der Rollenlager der beiden Kameralaufschlitten des Strukturdetektors SD 1C siehe Extradaten'"
|
||
]
|
||
},
|
||
"execution_count": 506,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"idx = target_indices[3]\n",
|
||
"temp2.at[idx, 'descr']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'Leiterprüfung derzeit in Arbeit Abteilungsleiter sind per Email am 11.06.2019 über deren Eigenverantwortlichkeit und Mithilfe durch Herr Graf informiert worden.'"
|
||
]
|
||
},
|
||
"execution_count": 229,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp2.at[1921,'descr']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"True"
|
||
]
|
||
},
|
||
"execution_count": 197,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"token_set.issuperset(check_words)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"{'ADJD'}"
|
||
]
|
||
},
|
||
"execution_count": 180,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"POS_of_interest\n",
|
||
"TAG_of_interest"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"test = 'Tägliche, tägliche Wartungstätigkeit des Maschinenherstellers Maschine'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"doc = nlp(test)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"täglich\n",
|
||
"--\n",
|
||
"täglich\n",
|
||
"wartungstätigkeit\n",
|
||
"der\n",
|
||
"maschinenhersteller\n",
|
||
"maschine\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"for token in doc:\n",
|
||
" print(token.lemma_.lower())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"replace_chars = [',', '\\n', '\\t', '\\s']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"test = test.lower()\n",
|
||
"for char in replace_chars:\n",
|
||
" test = test.replace(char, '')\n",
|
||
"test = test.split()\n",
|
||
"test = set(test)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"{'des', 'maschine', 'maschinenherstellers', 'tägliche', 'wartungstätigkeit'}"
|
||
]
|
||
},
|
||
"execution_count": 112,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"test"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"False"
|
||
]
|
||
},
|
||
"execution_count": 104,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"test.issuperset(check_words)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**Zwischenergebnisse:**\n",
|
||
"\n",
|
||
"*bestimmte ObjektIDs haben den Escape-Charakter, andere nicht: keine ObjektID mit beiden Varianten*"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Anzahl der Duplikate = 47689 für Beschreibung mit Index-Nr. 171:\n",
|
||
" Tägliche Wartungstätigkeiten nach Vorgabe des Maschinenherstellers\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(f\"Anzahl der Duplikate = {max_val} für Beschreibung mit Index-Nr. {index}:\\n {text}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"---\n",
|
||
"\n",
|
||
"# Merkmal 2: VorgangsArtText"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 53,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"feature = 'VorgangsArtText'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 54,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"base = wo_duplicates.copy()\n",
|
||
"base = base.dropna(axis=0, subset=feature)\n",
|
||
"base[feature] = base[feature].map(clean_string)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 55,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>VorgangsID</th>\n",
|
||
" <th>ObjektID</th>\n",
|
||
" <th>HObjektText</th>\n",
|
||
" <th>ObjektArtID</th>\n",
|
||
" <th>ObjektArtText</th>\n",
|
||
" <th>VorgangsTypID</th>\n",
|
||
" <th>VorgangsTypName</th>\n",
|
||
" <th>VorgangsDatum</th>\n",
|
||
" <th>VorgangsStatusId</th>\n",
|
||
" <th>VorgangsPrioritaet</th>\n",
|
||
" <th>VorgangsBeschreibung</th>\n",
|
||
" <th>VorgangsOrt</th>\n",
|
||
" <th>VorgangsArtText</th>\n",
|
||
" <th>ErledigungsDatum</th>\n",
|
||
" <th>ErledigungsArtText</th>\n",
|
||
" <th>ErledigungsBeschreibung</th>\n",
|
||
" <th>MPMelderArbeitsplatz</th>\n",
|
||
" <th>MPAbteilungBezeichnung</th>\n",
|
||
" <th>Arbeitsbeginn</th>\n",
|
||
" <th>ErstellungsDatum</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>11</td>\n",
|
||
" <td>114</td>\n",
|
||
" <td>427 C , Webmaschine, DL 280 EMS Breite 280</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-06</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Kettbaum kaputt</td>\n",
|
||
" <td>2019-03-06</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>2019-03-06</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>17</td>\n",
|
||
" <td>124</td>\n",
|
||
" <td>621 C , Webmaschine, DL 280 EMS Breite 280</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-11</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>asgasdg</td>\n",
|
||
" <td>2019-03-11</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Elektrowerkstatt</td>\n",
|
||
" <td>Elektrowerkstatt</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>2019-03-11</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>53</td>\n",
|
||
" <td>244</td>\n",
|
||
" <td>285 C, Webmaschine, SG 220 EMS</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>Greifer-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-19</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Kupplung schleift</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Kupplung defekt</td>\n",
|
||
" <td>2019-03-20</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>2019-03-19</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>58</td>\n",
|
||
" <td>257</td>\n",
|
||
" <td>107, Webmaschine, OM 220 EOS</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Gegengewicht wieder anbringen</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Gegengewicht an der Webmaschine abgefallen</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Schraube ausgebohrt\\nGegengewicht wieder angeb...</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>81</td>\n",
|
||
" <td>138</td>\n",
|
||
" <td>00138, Schärmaschine 9,</td>\n",
|
||
" <td>16</td>\n",
|
||
" <td>Schärmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>da ist etwas gebrochen. (Herr Heininger)</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>zentrale Bremsenverstellung linke Gatterseite ...</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Bolzen gebrochen. Bolzen neu angefertig und di...</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" VorgangsID ObjektID HObjektText \\\n",
|
||
"0 11 114 427 C , Webmaschine, DL 280 EMS Breite 280 \n",
|
||
"1 17 124 621 C , Webmaschine, DL 280 EMS Breite 280 \n",
|
||
"2 53 244 285 C, Webmaschine, SG 220 EMS \n",
|
||
"3 58 257 107, Webmaschine, OM 220 EOS \n",
|
||
"4 81 138 00138, Schärmaschine 9, \n",
|
||
"\n",
|
||
" ObjektArtID ObjektArtText VorgangsTypID VorgangsTypName \\\n",
|
||
"0 3 Luft-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"1 3 Luft-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"2 5 Greifer-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"3 3 Luft-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"4 16 Schärmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"\n",
|
||
" VorgangsDatum VorgangsStatusId VorgangsPrioritaet \\\n",
|
||
"0 2019-03-06 4 0 \n",
|
||
"1 2019-03-11 5 0 \n",
|
||
"2 2019-03-19 5 0 \n",
|
||
"3 2019-03-21 5 0 \n",
|
||
"4 2019-03-25 5 0 \n",
|
||
"\n",
|
||
" VorgangsBeschreibung VorgangsOrt \\\n",
|
||
"0 NaN NaN \n",
|
||
"1 NaN NaN \n",
|
||
"2 Kupplung schleift NaN \n",
|
||
"3 Gegengewicht wieder anbringen NaN \n",
|
||
"4 da ist etwas gebrochen. (Herr Heininger) NaN \n",
|
||
"\n",
|
||
" VorgangsArtText ErledigungsDatum \\\n",
|
||
"0 Kettbaum kaputt 2019-03-06 \n",
|
||
"1 asgasdg 2019-03-11 \n",
|
||
"2 Kupplung defekt 2019-03-20 \n",
|
||
"3 Gegengewicht an der Webmaschine abgefallen 2019-03-21 \n",
|
||
"4 zentrale Bremsenverstellung linke Gatterseite ... 2019-03-25 \n",
|
||
"\n",
|
||
" ErledigungsArtText ErledigungsBeschreibung \\\n",
|
||
"0 NaN NaN \n",
|
||
"1 NaN NaN \n",
|
||
"2 Reparatur UTT NaN \n",
|
||
"3 Reparatur UTT Schraube ausgebohrt\\nGegengewicht wieder angeb... \n",
|
||
"4 Reparatur UTT Bolzen gebrochen. Bolzen neu angefertig und di... \n",
|
||
"\n",
|
||
" MPMelderArbeitsplatz MPAbteilungBezeichnung Arbeitsbeginn ErstellungsDatum \n",
|
||
"0 Weberei Weberei NaT 2019-03-06 \n",
|
||
"1 Elektrowerkstatt Elektrowerkstatt NaT 2019-03-11 \n",
|
||
"2 Weberei Weberei NaT 2019-03-19 \n",
|
||
"3 Weberei Weberei 2019-03-21 2019-03-21 \n",
|
||
"4 Vorwerk Vorwerk 2019-03-25 2019-03-25 "
|
||
]
|
||
},
|
||
"execution_count": 55,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"base.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 56,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Einträge: 128936\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"descriptions = base[feature]\n",
|
||
"print(f\"Einträge: {len(descriptions)}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 57,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Anzahl Duplikate VorgangsArtText: 128545\n",
|
||
"Anzahl einzigartiger VorgangsArtText: 391\n",
|
||
"Anteil einzigartiger VorgangsArtText: 0.30 %\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"num_dupl_descr = descriptions.duplicated().sum()\n",
|
||
"uni_descr = descriptions.unique()\n",
|
||
"num_uni_descr = len(uni_descr)\n",
|
||
"\n",
|
||
"print(f\"Anzahl Duplikate {feature}: {num_dupl_descr}\")\n",
|
||
"print(f\"Anzahl einzigartiger {feature}: {num_uni_descr}\")\n",
|
||
"print(f\"Anteil einzigartiger {feature}: {num_uni_descr / len(descriptions) * 100:.2f} %\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 58,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"if not LOAD_CALC_FILES:\n",
|
||
" cols = ['descr', 'len', 'num_occur', 'assoc_obj_ids', 'num_assoc_obj_ids']\n",
|
||
" descr_df = pd.DataFrame(columns=cols)\n",
|
||
" max_val = 0\n",
|
||
" text = None\n",
|
||
" index = 0\n",
|
||
"\n",
|
||
"\n",
|
||
" for idx, description in enumerate(uni_descr):\n",
|
||
" len_descr = len(description)\n",
|
||
" filt = base[feature] == description\n",
|
||
" temp = base[filt]\n",
|
||
" assoc_obj_ids = temp['ObjektID'].unique()\n",
|
||
" assoc_obj_ids = np.sort(assoc_obj_ids, kind='stable')\n",
|
||
" num_assoc_obj_ids = len(assoc_obj_ids)\n",
|
||
" num_dupl = filt.sum()\n",
|
||
" \n",
|
||
" conc_df = pd.DataFrame(data=[[\n",
|
||
" description,\n",
|
||
" len_descr,\n",
|
||
" num_dupl,\n",
|
||
" assoc_obj_ids,\n",
|
||
" num_assoc_obj_ids\n",
|
||
" ]], columns=cols)\n",
|
||
" \n",
|
||
" descr_df = pd.concat([descr_df, conc_df], ignore_index=True)\n",
|
||
" \n",
|
||
" if num_dupl > max_val:\n",
|
||
" max_val = num_dupl\n",
|
||
" index = idx\n",
|
||
" text = description\n",
|
||
" \n",
|
||
" temp1 = descr_df.sort_values(by='num_occur', ascending=False)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 59,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>descr</th>\n",
|
||
" <th>len</th>\n",
|
||
" <th>num_occur</th>\n",
|
||
" <th>assoc_obj_ids</th>\n",
|
||
" <th>num_assoc_obj_ids</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>60</th>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>92719</td>\n",
|
||
" <td>[0, 17, 41, 42, 43, 44, 45, 46, 47, 51, 52, 53...</td>\n",
|
||
" <td>206</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>01 Interne Reinigung Pflege Überprüfung</td>\n",
|
||
" <td>39</td>\n",
|
||
" <td>11250</td>\n",
|
||
" <td>[0, 7, 425, 426, 427, 428, 429, 517, 518, 576,...</td>\n",
|
||
" <td>349</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>28</th>\n",
|
||
" <td>02 Interne Reinigung Pflege Überprüfung</td>\n",
|
||
" <td>39</td>\n",
|
||
" <td>3263</td>\n",
|
||
" <td>[576, 906, 910, 940, 941, 942, 943, 1040, 1041...</td>\n",
|
||
" <td>52</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>29</th>\n",
|
||
" <td>Maschinen-Wartung wöchentlich</td>\n",
|
||
" <td>29</td>\n",
|
||
" <td>2408</td>\n",
|
||
" <td>[1, 301, 305, 313, 314, 331, 332, 510, 511, 51...</td>\n",
|
||
" <td>25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>46</th>\n",
|
||
" <td>Gesetzliche Wartung Prüfung jährlich</td>\n",
|
||
" <td>36</td>\n",
|
||
" <td>2403</td>\n",
|
||
" <td>[0, 191, 193, 195, 197, 200, 287, 288, 289, 29...</td>\n",
|
||
" <td>638</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>222</th>\n",
|
||
" <td>Walze WK 03 Umlenkwalze zapfen</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[1]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>224</th>\n",
|
||
" <td>Leiter Nr. 90 und überprüfen</td>\n",
|
||
" <td>28</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[1]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>225</th>\n",
|
||
" <td>Locht nicht mehr</td>\n",
|
||
" <td>16</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[338]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>226</th>\n",
|
||
" <td>Maschine stellt immer wieder ab</td>\n",
|
||
" <td>31</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[338]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>390</th>\n",
|
||
" <td>Gesetzliche Wartung Prüfung Anlagenprüfung Dru...</td>\n",
|
||
" <td>56</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[547]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>391 rows × 5 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" descr len num_occur \\\n",
|
||
"60 Tägliche Interne Wartungstätigkeiten Weberei 44 92719 \n",
|
||
"10 01 Interne Reinigung Pflege Überprüfung 39 11250 \n",
|
||
"28 02 Interne Reinigung Pflege Überprüfung 39 3263 \n",
|
||
"29 Maschinen-Wartung wöchentlich 29 2408 \n",
|
||
"46 Gesetzliche Wartung Prüfung jährlich 36 2403 \n",
|
||
".. ... .. ... \n",
|
||
"222 Walze WK 03 Umlenkwalze zapfen 30 1 \n",
|
||
"224 Leiter Nr. 90 und überprüfen 28 1 \n",
|
||
"225 Locht nicht mehr 16 1 \n",
|
||
"226 Maschine stellt immer wieder ab 31 1 \n",
|
||
"390 Gesetzliche Wartung Prüfung Anlagenprüfung Dru... 56 1 \n",
|
||
"\n",
|
||
" assoc_obj_ids num_assoc_obj_ids \n",
|
||
"60 [0, 17, 41, 42, 43, 44, 45, 46, 47, 51, 52, 53... 206 \n",
|
||
"10 [0, 7, 425, 426, 427, 428, 429, 517, 518, 576,... 349 \n",
|
||
"28 [576, 906, 910, 940, 941, 942, 943, 1040, 1041... 52 \n",
|
||
"29 [1, 301, 305, 313, 314, 331, 332, 510, 511, 51... 25 \n",
|
||
"46 [0, 191, 193, 195, 197, 200, 287, 288, 289, 29... 638 \n",
|
||
".. ... ... \n",
|
||
"222 [1] 1 \n",
|
||
"224 [1] 1 \n",
|
||
"225 [338] 1 \n",
|
||
"226 [338] 1 \n",
|
||
"390 [547] 1 \n",
|
||
"\n",
|
||
"[391 rows x 5 columns]"
|
||
]
|
||
},
|
||
"execution_count": 59,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 60,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# save/load dataframe\n",
|
||
"FILE_PATH = f'{feature}_analyse_1.fth'\n",
|
||
"if LOAD_CALC_FILES:\n",
|
||
"    temp1 = pd.read_feather(FILE_PATH)\n",
|
||
"    temp1 = temp1.set_index('index')\n",
|
||
"else:\n",
|
||
"    save_df = temp1.reset_index()\n",
|
||
"    save_df.to_feather(FILE_PATH)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Entire dataset"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 61,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# use the description and occurrence-count columns of the entire dataset\n",
|
||
"descr = temp1[['descr', 'num_occur']]\n",
|
||
"#descr = descr.iloc[50:200,:]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 62,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"#descr.iat[0,0] = 'Das ist ein Test am 24.08.2023'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 63,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"391"
|
||
]
|
||
},
|
||
"execution_count": 63,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"len(descr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 64,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>descr</th>\n",
|
||
" <th>num_occur</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>60</th>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>92719</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>01 Interne Reinigung Pflege Überprüfung</td>\n",
|
||
" <td>11250</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>28</th>\n",
|
||
" <td>02 Interne Reinigung Pflege Überprüfung</td>\n",
|
||
" <td>3263</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>29</th>\n",
|
||
" <td>Maschinen-Wartung wöchentlich</td>\n",
|
||
" <td>2408</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>46</th>\n",
|
||
" <td>Gesetzliche Wartung Prüfung jährlich</td>\n",
|
||
" <td>2403</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>222</th>\n",
|
||
" <td>Walze WK 03 Umlenkwalze zapfen</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>224</th>\n",
|
||
" <td>Leiter Nr. 90 und überprüfen</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>225</th>\n",
|
||
" <td>Locht nicht mehr</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>226</th>\n",
|
||
" <td>Maschine stellt immer wieder ab</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>390</th>\n",
|
||
" <td>Gesetzliche Wartung Prüfung Anlagenprüfung Dru...</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>391 rows × 2 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" descr num_occur\n",
|
||
"60 Tägliche Interne Wartungstätigkeiten Weberei 92719\n",
|
||
"10 01 Interne Reinigung Pflege Überprüfung 11250\n",
|
||
"28 02 Interne Reinigung Pflege Überprüfung 3263\n",
|
||
"29 Maschinen-Wartung wöchentlich 2408\n",
|
||
"46 Gesetzliche Wartung Prüfung jährlich 2403\n",
|
||
".. ... ...\n",
|
||
"222 Walze WK 03 Umlenkwalze zapfen 1\n",
|
||
"224 Leiter Nr. 90 und überprüfen 1\n",
|
||
"225 Locht nicht mehr 1\n",
|
||
"226 Maschine stellt immer wieder ab 1\n",
|
||
"390 Gesetzliche Wartung Prüfung Anlagenprüfung Dru... 1\n",
|
||
"\n",
|
||
"[391 rows x 2 columns]"
|
||
]
|
||
},
|
||
"execution_count": 64,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"descr"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 65,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"#LOAD_CALC_FILES = True\n",
|
||
"#LOAD_CALC_FILES = False\n",
|
||
"#IS_TEST = True\n",
|
||
"IS_TEST = False"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 66,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"INFO:base:Number of entries processed: 1, Percent completed: 0.26\n"
|
||
]
|
||
},
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"INFO:base:Number of entries processed: 101, Percent completed: 25.83\n",
|
||
"INFO:base:Number of entries processed: 201, Percent completed: 51.41\n",
|
||
"INFO:base:Number of entries processed: 301, Percent completed: 76.98\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# adjacency matrix\n",
|
||
"connections = dict()\n",
|
||
"unique_tokens = set()\n",
|
||
"UPDATE_STATUS = 100\n",
|
||
"length_data = len(descr)\n",
|
||
"spell_check_candidates = set()\n",
|
||
"spell_checker = SpellChecker(language='de', distance=1)\n",
|
||
"\n",
|
||
"if not LOAD_CALC_FILES or IS_TEST:\n",
|
||
"    for count, (_, row) in enumerate(descr.iterrows()):\n",
|
||
"\n",
|
||
"        text = row['descr']\n",
|
||
"        weight = row['num_occur']\n",
|
||
"\n",
|
||
"        doc = nlp(text)\n",
|
||
"\n",
|
||
"        obtain_descendant_info(\n",
|
||
"            doc=doc,\n",
|
||
"            weight=weight,\n",
|
||
"            POS_of_interest=POS_of_interest,\n",
|
||
"            TAG_of_interest=TAG_of_interest,\n",
|
||
"            connections=connections,\n",
|
||
"            unique_tokens=unique_tokens,\n",
|
||
"            spell_check_candidates=spell_check_candidates,\n",
|
||
"            spell_check_whitelist=spell_check_whitelist,\n",
|
||
"            spell_checker=spell_checker,\n",
|
||
"            corrections=corrections,\n",
|
||
"        )\n",
|
||
"\n",
|
||
"        if count % UPDATE_STATUS == 0:\n",
|
||
"            logger.info(f'Number of entries processed: {count+1}, Percent completed: {((count+1) / length_data) * 100:.2f}')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 67,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"ADJ_DF_PATH = f'./Graphanalyse/adj_mat_df_{feature}.fth'\n",
|
||
"if not IS_TEST:\n",
|
||
"    if LOAD_CALC_FILES:\n",
|
||
"        adj_mat_undir = pd.read_feather(ADJ_DF_PATH)\n",
|
||
"        adj_mat_undir = adj_mat_undir.set_index('index')\n",
|
||
"        # additional information\n",
|
||
"        connections = load_pickle('connections.pkl')\n",
|
||
"        unique_tokens = load_pickle('unique_tokens.pkl')\n",
|
||
"    else:\n",
|
||
"        adj_mat = obtain_adj_matrix(unique_tokens=unique_tokens, connections=connections)\n",
|
||
"        adj_mat_undir = make_undir_adj_matrix(adj_mat=adj_mat)\n",
|
||
"        save_df = adj_mat_undir.reset_index()\n",
|
||
"        save_df.to_feather(ADJ_DF_PATH)\n",
|
||
"        # additional information\n",
|
||
"        save_pickle(obj=connections, path='connections.pkl')\n",
|
||
"        save_pickle(obj=unique_tokens, path='unique_tokens.pkl')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 68,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>lecken</th>\n",
|
||
" <th>WC</th>\n",
|
||
" <th>LKW</th>\n",
|
||
" <th>offen</th>\n",
|
||
" <th>Maschinen-Reinigung</th>\n",
|
||
" <th>Dockenwickler</th>\n",
|
||
" <th>halb-jährlich</th>\n",
|
||
" <th>Tisch</th>\n",
|
||
" <th>zentral</th>\n",
|
||
" <th>anbringen</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>undicht-</th>\n",
|
||
" <th>Platine</th>\n",
|
||
" <th>erneuern</th>\n",
|
||
" <th>Verschmutzung</th>\n",
|
||
" <th>befestigen</th>\n",
|
||
" <th>wechseln</th>\n",
|
||
" <th>Labor</th>\n",
|
||
" <th>Walze</th>\n",
|
||
" <th>anfahren</th>\n",
|
||
" <th>Leiter</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>12-monatige-Inspektion</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2-monatlich</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2-wöchentlich</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>24-monatige-Inspektion</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3-jährlich</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Ölwechsel</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Überprüfung</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>äußerer</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>überprüfen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>überziehen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>390 rows × 390 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" lecken WC LKW offen Maschinen-Reinigung \\\n",
|
||
"12-monatige-Inspektion 0 0 0 0 0 \n",
|
||
"2-monatlich 0 0 0 0 0 \n",
|
||
"2-wöchentlich 0 0 0 0 0 \n",
|
||
"24-monatige-Inspektion 0 0 0 0 0 \n",
|
||
"3-jährlich 0 0 0 0 0 \n",
|
||
"... ... .. ... ... ... \n",
|
||
"Ölwechsel 0 0 0 0 0 \n",
|
||
"Überprüfung 0 0 0 0 0 \n",
|
||
"äußerer 0 0 0 0 0 \n",
|
||
"überprüfen 0 0 0 0 0 \n",
|
||
"überziehen 0 0 0 0 0 \n",
|
||
"\n",
|
||
" Dockenwickler halb-jährlich Tisch zentral \\\n",
|
||
"12-monatige-Inspektion 0 0 0 0 \n",
|
||
"2-monatlich 0 0 0 0 \n",
|
||
"2-wöchentlich 0 0 0 0 \n",
|
||
"24-monatige-Inspektion 0 0 0 0 \n",
|
||
"3-jährlich 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"Ölwechsel 0 0 0 0 \n",
|
||
"Überprüfung 0 0 0 0 \n",
|
||
"äußerer 0 0 0 0 \n",
|
||
"überprüfen 0 0 0 0 \n",
|
||
"überziehen 0 0 0 0 \n",
|
||
"\n",
|
||
" anbringen ... undicht- Platine erneuern \\\n",
|
||
"12-monatige-Inspektion 0 ... 0 0 0 \n",
|
||
"2-monatlich 0 ... 0 0 0 \n",
|
||
"2-wöchentlich 0 ... 0 0 0 \n",
|
||
"24-monatige-Inspektion 0 ... 0 0 0 \n",
|
||
"3-jährlich 0 ... 0 0 0 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"Ölwechsel 0 ... 0 0 0 \n",
|
||
"Überprüfung 0 ... 0 0 0 \n",
|
||
"äußerer 0 ... 0 0 0 \n",
|
||
"überprüfen 0 ... 0 0 0 \n",
|
||
"überziehen 0 ... 0 0 0 \n",
|
||
"\n",
|
||
" Verschmutzung befestigen wechseln Labor Walze \\\n",
|
||
"12-monatige-Inspektion 0 0 0 0 0 \n",
|
||
"2-monatlich 0 0 0 0 0 \n",
|
||
"2-wöchentlich 0 0 0 0 0 \n",
|
||
"24-monatige-Inspektion 0 0 0 0 0 \n",
|
||
"3-jährlich 0 0 0 0 0 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"Ölwechsel 0 0 0 0 0 \n",
|
||
"Überprüfung 0 0 0 0 0 \n",
|
||
"äußerer 0 0 0 0 0 \n",
|
||
"überprüfen 0 0 0 0 0 \n",
|
||
"überziehen 0 0 0 0 1 \n",
|
||
"\n",
|
||
" anfahren Leiter \n",
|
||
"12-monatige-Inspektion 0 0 \n",
|
||
"2-monatlich 0 0 \n",
|
||
"2-wöchentlich 0 0 \n",
|
||
"24-monatige-Inspektion 0 0 \n",
|
||
"3-jährlich 0 0 \n",
|
||
"... ... ... \n",
|
||
"Ölwechsel 0 0 \n",
|
||
"Überprüfung 0 0 \n",
|
||
"äußerer 0 0 \n",
|
||
"überprüfen 0 1 \n",
|
||
"überziehen 0 0 \n",
|
||
"\n",
|
||
"[390 rows x 390 columns]"
|
||
]
|
||
},
|
||
"execution_count": 68,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"adj_mat_undir.sort_index()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 69,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"arr = adj_mat_undir.to_numpy()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 70,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"391"
|
||
]
|
||
},
|
||
"execution_count": 70,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.count_nonzero(arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 71,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"92964"
|
||
]
|
||
},
|
||
"execution_count": 71,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.max(arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Threshold: entries of the adjacency matrix below `WEIGHT_THRESHOLD` are set to 0"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 162,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"WEIGHT_THRESHOLD = 0"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 163,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"arr = adj_mat_undir.to_numpy()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 164,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"arr = np.where(arr < WEIGHT_THRESHOLD, 0, arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 165,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"391"
|
||
]
|
||
},
|
||
"execution_count": 165,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.count_nonzero(arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 166,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"233"
|
||
]
|
||
},
|
||
"execution_count": 166,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp = np.sum(arr, axis=0)\n",
|
||
"np.count_nonzero(temp)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 167,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"thresh_adj_mat = adj_mat_undir.copy()\n",
|
||
"thresh_adj_mat.loc[:] = arr"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 168,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Wasserleitung</th>\n",
|
||
" <th>wechseln</th>\n",
|
||
" <th>Winkelpositionsgeber</th>\n",
|
||
" <th>Klimaanlagengerät</th>\n",
|
||
" <th>versetzen</th>\n",
|
||
" <th>Brennschlitten</th>\n",
|
||
" <th>feststellen</th>\n",
|
||
" <th>Stuhl</th>\n",
|
||
" <th>monatlich</th>\n",
|
||
" <th>anfertigen</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Zahnriemen</th>\n",
|
||
" <th>Rampe</th>\n",
|
||
" <th>Tisch</th>\n",
|
||
" <th>defekt</th>\n",
|
||
" <th>Elektrische</th>\n",
|
||
" <th>haben</th>\n",
|
||
" <th>Wasserenthärtungsanlage</th>\n",
|
||
" <th>Gestank</th>\n",
|
||
" <th>Zahnrad</th>\n",
|
||
" <th>hydraulisch</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>Wasserleitung</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>wechseln</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Winkelpositionsgeber</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Klimaanlagengerät</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>versetzen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>haben</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Wasserenthärtungsanlage</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Gestank</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Zahnrad</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>hydraulisch</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>390 rows × 390 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Wasserleitung wechseln Winkelpositionsgeber \\\n",
|
||
"Wasserleitung 0 0 0 \n",
|
||
"wechseln 0 0 0 \n",
|
||
"Winkelpositionsgeber 0 0 0 \n",
|
||
"Klimaanlagengerät 0 0 0 \n",
|
||
"versetzen 0 0 0 \n",
|
||
"... ... ... ... \n",
|
||
"haben 0 0 0 \n",
|
||
"Wasserenthärtungsanlage 0 0 0 \n",
|
||
"Gestank 0 0 0 \n",
|
||
"Zahnrad 0 0 0 \n",
|
||
"hydraulisch 0 0 0 \n",
|
||
"\n",
|
||
" Klimaanlagengerät versetzen Brennschlitten \\\n",
|
||
"Wasserleitung 0 0 0 \n",
|
||
"wechseln 0 0 0 \n",
|
||
"Winkelpositionsgeber 0 0 0 \n",
|
||
"Klimaanlagengerät 0 0 0 \n",
|
||
"versetzen 0 0 0 \n",
|
||
"... ... ... ... \n",
|
||
"haben 0 0 0 \n",
|
||
"Wasserenthärtungsanlage 0 0 0 \n",
|
||
"Gestank 0 0 0 \n",
|
||
"Zahnrad 0 0 0 \n",
|
||
"hydraulisch 0 0 0 \n",
|
||
"\n",
|
||
" feststellen Stuhl monatlich anfertigen ... \\\n",
|
||
"Wasserleitung 0 0 0 0 ... \n",
|
||
"wechseln 0 0 0 0 ... \n",
|
||
"Winkelpositionsgeber 0 0 0 0 ... \n",
|
||
"Klimaanlagengerät 0 0 0 0 ... \n",
|
||
"versetzen 0 0 0 0 ... \n",
|
||
"... ... ... ... ... ... \n",
|
||
"haben 0 0 0 0 ... \n",
|
||
"Wasserenthärtungsanlage 0 0 0 0 ... \n",
|
||
"Gestank 0 0 0 0 ... \n",
|
||
"Zahnrad 0 0 0 0 ... \n",
|
||
"hydraulisch 0 0 0 0 ... \n",
|
||
"\n",
|
||
" Zahnriemen Rampe Tisch defekt Elektrische haben \\\n",
|
||
"Wasserleitung 0 0 0 0 0 0 \n",
|
||
"wechseln 0 0 0 0 0 0 \n",
|
||
"Winkelpositionsgeber 0 0 0 1 0 0 \n",
|
||
"Klimaanlagengerät 0 0 0 0 0 0 \n",
|
||
"versetzen 0 0 0 0 0 0 \n",
|
||
"... ... ... ... ... ... ... \n",
|
||
"haben 0 0 0 0 0 0 \n",
|
||
"Wasserenthärtungsanlage 0 0 0 0 0 0 \n",
|
||
"Gestank 0 0 0 0 0 0 \n",
|
||
"Zahnrad 0 0 0 0 0 0 \n",
|
||
"hydraulisch 0 0 0 0 0 0 \n",
|
||
"\n",
|
||
" Wasserenthärtungsanlage Gestank Zahnrad \\\n",
|
||
"Wasserleitung 0 0 0 \n",
|
||
"wechseln 0 0 0 \n",
|
||
"Winkelpositionsgeber 0 0 0 \n",
|
||
"Klimaanlagengerät 0 0 0 \n",
|
||
"versetzen 0 0 0 \n",
|
||
"... ... ... ... \n",
|
||
"haben 0 0 0 \n",
|
||
"Wasserenthärtungsanlage 0 0 0 \n",
|
||
"Gestank 0 0 0 \n",
|
||
"Zahnrad 0 0 0 \n",
|
||
"hydraulisch 0 0 0 \n",
|
||
"\n",
|
||
" hydraulisch \n",
|
||
"Wasserleitung 0 \n",
|
||
"wechseln 0 \n",
|
||
"Winkelpositionsgeber 0 \n",
|
||
"Klimaanlagengerät 0 \n",
|
||
"versetzen 0 \n",
|
||
"... ... \n",
|
||
"haben 0 \n",
|
||
"Wasserenthärtungsanlage 0 \n",
|
||
"Gestank 0 \n",
|
||
"Zahnrad 0 \n",
|
||
"hydraulisch 0 \n",
|
||
"\n",
|
||
"[390 rows x 390 columns]"
|
||
]
|
||
},
|
||
"execution_count": 168,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"thresh_adj_mat"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 169,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"ADJ_MAT_PATH_CSV = f'./Graphanalyse/adj_mat_thresh_{feature}_{WEIGHT_THRESHOLD}.csv'\n",
|
||
"thresh_adj_mat.to_csv(path_or_buf=ADJ_MAT_PATH_CSV, encoding='cp1252', sep=';')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"---\n",
|
||
"\n",
|
||
"# Feature 3: ErledigungsBeschreibung"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 72,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"feature = 'ErledigungsBeschreibung'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 73,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"base = wo_duplicates.copy()\n",
|
||
"base = base.dropna(axis=0, subset=feature)\n",
|
||
"base[feature] = base[feature].map(clean_string)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 74,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>VorgangsID</th>\n",
|
||
" <th>ObjektID</th>\n",
|
||
" <th>HObjektText</th>\n",
|
||
" <th>ObjektArtID</th>\n",
|
||
" <th>ObjektArtText</th>\n",
|
||
" <th>VorgangsTypID</th>\n",
|
||
" <th>VorgangsTypName</th>\n",
|
||
" <th>VorgangsDatum</th>\n",
|
||
" <th>VorgangsStatusId</th>\n",
|
||
" <th>VorgangsPrioritaet</th>\n",
|
||
" <th>VorgangsBeschreibung</th>\n",
|
||
" <th>VorgangsOrt</th>\n",
|
||
" <th>VorgangsArtText</th>\n",
|
||
" <th>ErledigungsDatum</th>\n",
|
||
" <th>ErledigungsArtText</th>\n",
|
||
" <th>ErledigungsBeschreibung</th>\n",
|
||
" <th>MPMelderArbeitsplatz</th>\n",
|
||
" <th>MPAbteilungBezeichnung</th>\n",
|
||
" <th>Arbeitsbeginn</th>\n",
|
||
" <th>ErstellungsDatum</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>58</td>\n",
|
||
" <td>257</td>\n",
|
||
" <td>107, Webmaschine, OM 220 EOS</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Gegengewicht wieder anbringen</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Gegengewicht an der Webmaschine abgefallen</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Schraube ausgebohrt Gegengewicht wieder angebr...</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" <td>2019-03-21</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>81</td>\n",
|
||
" <td>138</td>\n",
|
||
" <td>00138, Schärmaschine 9,</td>\n",
|
||
" <td>16</td>\n",
|
||
" <td>Schärmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>da ist etwas gebrochen. (Herr Heininger)</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>zentrale Bremsenverstellung linke Gatterseite ...</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Bolzen gebrochen. Bolzen neu angefertig und di...</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>82</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Warenschau allgemein</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Klappbügel Portalkran H31 defekt</td>\n",
|
||
" <td>Warenschau allgemein</td>\n",
|
||
" <td>Allgemeine Reparaturarbeiten</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Feder ausgetauscht</td>\n",
|
||
" <td>Warenschau</td>\n",
|
||
" <td>Warenschau</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>76</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Neben der Türe</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-03-22</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Schraube nix mer gut</td>\n",
|
||
" <td>Neben der Türe</td>\n",
|
||
" <td>Kettbaum</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>Schrauben ausgebohrt Gewinde nachgeschnitten</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>Vorwerk</td>\n",
|
||
" <td>2019-03-25</td>\n",
|
||
" <td>2019-03-22</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>111</td>\n",
|
||
" <td>241</td>\n",
|
||
" <td>294 C, Webmaschine, SG 240 EMS</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>Greifer-Webmaschine</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Reparaturauftrag (Portal)</td>\n",
|
||
" <td>2019-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>KBK tauschen\\nUrsache vermutlich mechanisch</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Kupplung-Brems-Kombination</td>\n",
|
||
" <td>2019-04-08</td>\n",
|
||
" <td>Reparatur UTT</td>\n",
|
||
" <td>da derzeit Keine Ersatzteile da Reparatur mit ...</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>Weberei</td>\n",
|
||
" <td>2019-04-02</td>\n",
|
||
" <td>2019-04-01</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" VorgangsID ObjektID HObjektText ObjektArtID \\\n",
|
||
"3 58 257 107, Webmaschine, OM 220 EOS 3 \n",
|
||
"4 81 138 00138, Schärmaschine 9, 16 \n",
|
||
"5 82 0 Warenschau allgemein 0 \n",
|
||
"6 76 0 Neben der Türe 0 \n",
|
||
"8 111 241 294 C, Webmaschine, SG 240 EMS 5 \n",
|
||
"\n",
|
||
" ObjektArtText VorgangsTypID VorgangsTypName \\\n",
|
||
"3 Luft-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"4 Schärmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"5 NaN 3 Reparaturauftrag (Portal) \n",
|
||
"6 NaN 3 Reparaturauftrag (Portal) \n",
|
||
"8 Greifer-Webmaschine 3 Reparaturauftrag (Portal) \n",
|
||
"\n",
|
||
" VorgangsDatum VorgangsStatusId VorgangsPrioritaet \\\n",
|
||
"3 2019-03-21 5 0 \n",
|
||
"4 2019-03-25 5 0 \n",
|
||
"5 2019-03-25 5 0 \n",
|
||
"6 2019-03-22 5 0 \n",
|
||
"8 2019-04-01 5 0 \n",
|
||
"\n",
|
||
" VorgangsBeschreibung VorgangsOrt \\\n",
|
||
"3 Gegengewicht wieder anbringen NaN \n",
|
||
"4 da ist etwas gebrochen. (Herr Heininger) NaN \n",
|
||
"5 Klappbügel Portalkran H31 defekt Warenschau allgemein \n",
|
||
"6 Schraube nix mer gut Neben der Türe \n",
|
||
"8 KBK tauschen\\nUrsache vermutlich mechanisch NaN \n",
|
||
"\n",
|
||
" VorgangsArtText ErledigungsDatum \\\n",
|
||
"3 Gegengewicht an der Webmaschine abgefallen 2019-03-21 \n",
|
||
"4 zentrale Bremsenverstellung linke Gatterseite ... 2019-03-25 \n",
|
||
"5 Allgemeine Reparaturarbeiten 2019-03-25 \n",
|
||
"6 Kettbaum 2019-03-25 \n",
|
||
"8 Kupplung-Brems-Kombination 2019-04-08 \n",
|
||
"\n",
|
||
" ErledigungsArtText ErledigungsBeschreibung \\\n",
|
||
"3 Reparatur UTT Schraube ausgebohrt Gegengewicht wieder angebr... \n",
|
||
"4 Reparatur UTT Bolzen gebrochen. Bolzen neu angefertig und di... \n",
|
||
"5 Reparatur UTT Feder ausgetauscht \n",
|
||
"6 Reparatur UTT Schrauben ausgebohrt Gewinde nachgeschnitten \n",
|
||
"8 Reparatur UTT da derzeit Keine Ersatzteile da Reparatur mit ... \n",
|
||
"\n",
|
||
" MPMelderArbeitsplatz MPAbteilungBezeichnung Arbeitsbeginn ErstellungsDatum \n",
|
||
"3 Weberei Weberei 2019-03-21 2019-03-21 \n",
|
||
"4 Vorwerk Vorwerk 2019-03-25 2019-03-25 \n",
|
||
"5 Warenschau Warenschau 2019-03-25 2019-03-25 \n",
|
||
"6 Vorwerk Vorwerk 2019-03-25 2019-03-22 \n",
|
||
"8 Weberei Weberei 2019-04-02 2019-04-01 "
|
||
]
|
||
},
|
||
"execution_count": 74,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"base.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 75,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Entries: 118086\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"descriptions = base[feature]\n",
|
||
"print(f\"Entries: {len(descriptions)}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 76,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Number of duplicates ErledigungsBeschreibung: 110707\n",
|
||
"Number of unique ErledigungsBeschreibung: 7379\n",
|
||
"Share of unique ErledigungsBeschreibung: 6.25 %\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"num_dupl_descr = descriptions.duplicated().sum()\n",
|
||
"uni_descr = descriptions.unique()\n",
|
||
"num_uni_descr = len(uni_descr)\n",
|
||
"\n",
|
||
"print(f\"Number of duplicates {feature}: {num_dupl_descr}\")\n",
|
||
"print(f\"Number of unique {feature}: {num_uni_descr}\")\n",
|
||
"print(f\"Share of unique {feature}: {num_uni_descr / len(descriptions) * 100:.2f} %\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 77,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"False"
|
||
]
|
||
},
|
||
"execution_count": 77,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"LOAD_CALC_FILES"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 78,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"if not LOAD_CALC_FILES:\n",
|
||
"    cols = ['descr', 'len', 'num_occur', 'assoc_obj_ids', 'num_assoc_obj_ids']\n",
|
||
"    rows = []\n",
|
||
"    max_val = 0\n",
|
||
"    text = None\n",
|
||
"    index = 0\n",
|
||
"\n",
|
||
"    for idx, description in enumerate(uni_descr):\n",
|
||
"        len_descr = len(description)\n",
|
||
"        filt = base[feature] == description\n",
|
||
"        temp = base[filt]\n",
|
||
"        assoc_obj_ids = temp['ObjektID'].unique()\n",
|
||
"        assoc_obj_ids = np.sort(assoc_obj_ids, kind='stable')\n",
|
||
"        num_assoc_obj_ids = len(assoc_obj_ids)\n",
|
||
"        num_dupl = filt.sum()\n",
|
||
"\n",
|
||
"        # collect one row per unique description\n",
|
||
"        rows.append([\n",
|
||
"            description,\n",
|
||
"            len_descr,\n",
|
||
"            num_dupl,\n",
|
||
"            assoc_obj_ids,\n",
|
||
"            num_assoc_obj_ids\n",
|
||
"        ])\n",
|
||
"\n",
|
||
"        # track the most frequent description\n",
|
||
"        if num_dupl > max_val:\n",
|
||
"            max_val = num_dupl\n",
|
||
"            index = idx\n",
|
||
"            text = description\n",
|
||
"\n",
|
||
"    # build the frame once; pd.concat inside the loop is quadratic\n",
|
||
"    descr_df = pd.DataFrame(data=rows, columns=cols)\n",
|
||
"    temp1 = descr_df.sort_values(by='num_occur', ascending=False)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 79,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>descr</th>\n",
|
||
" <th>len</th>\n",
|
||
" <th>num_occur</th>\n",
|
||
" <th>assoc_obj_ids</th>\n",
|
||
" <th>num_assoc_obj_ids</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>112</th>\n",
|
||
" <td>Sichtkontrolle durchgeführt Auffälligkeiten fe...</td>\n",
|
||
" <td>95</td>\n",
|
||
" <td>98720</td>\n",
|
||
" <td>[0, 1, 7, 17, 41, 42, 43, 44, 45, 46, 47, 51, ...</td>\n",
|
||
" <td>953</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>108</th>\n",
|
||
" <td>Sichtkontrolle durchgeführt Auffälligkeiten fe...</td>\n",
|
||
" <td>100</td>\n",
|
||
" <td>1450</td>\n",
|
||
" <td>[0, 1, 140, 301, 305, 313, 314, 576, 970, 1110...</td>\n",
|
||
" <td>28</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>147</th>\n",
|
||
" <td>Externe Prüfung wurde durchgeführt Beanstandun...</td>\n",
|
||
" <td>119</td>\n",
|
||
" <td>1082</td>\n",
|
||
" <td>[191, 193, 195, 197, 200, 264, 287, 288, 289, ...</td>\n",
|
||
" <td>413</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128</th>\n",
|
||
" <td>Reinigung durchgeführt Auffälligkeiten festges...</td>\n",
|
||
" <td>90</td>\n",
|
||
" <td>762</td>\n",
|
||
" <td>[0, 1, 7, 123, 136, 137, 138, 177, 298, 304, 3...</td>\n",
|
||
" <td>90</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>96</th>\n",
|
||
" <td>Sichtkontrolle wie festgelegt durchgeführt Auf...</td>\n",
|
||
" <td>110</td>\n",
|
||
" <td>648</td>\n",
|
||
" <td>[1, 20, 21, 51, 52, 53, 54, 55, 56, 64, 65, 66...</td>\n",
|
||
" <td>271</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2805</th>\n",
|
||
" <td>X Achse Süd Führungswägen Kurze Version eingebaut</td>\n",
|
||
" <td>49</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[21]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2804</th>\n",
|
||
" <td>Maschinenrahmen ausgerichtet und ausgebeult. M...</td>\n",
|
||
" <td>90</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[144]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2803</th>\n",
|
||
" <td>Bügel und Stützräder getauscht</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[315]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2802</th>\n",
|
||
" <td>Graf: TK wurde in Arbeitsauftrag 65487 gewandelt</td>\n",
|
||
" <td>48</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[405]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7378</th>\n",
|
||
" <td>Neue Gasfeder eingebaut</td>\n",
|
||
" <td>23</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>[326]</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>7379 rows × 5 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" descr len num_occur \\\n",
|
||
"112 Sichtkontrolle durchgeführt Auffälligkeiten fe... 95 98720 \n",
|
||
"108 Sichtkontrolle durchgeführt Auffälligkeiten fe... 100 1450 \n",
|
||
"147 Externe Prüfung wurde durchgeführt Beanstandun... 119 1082 \n",
|
||
"128 Reinigung durchgeführt Auffälligkeiten festges... 90 762 \n",
|
||
"96 Sichtkontrolle wie festgelegt durchgeführt Auf... 110 648 \n",
|
||
"... ... ... ... \n",
|
||
"2805 X Achse Süd Führungswägen Kurze Version eingebaut 49 1 \n",
|
||
"2804 Maschinenrahmen ausgerichtet und ausgebeult. M... 90 1 \n",
|
||
"2803 Bügel und Stützräder getauscht 30 1 \n",
|
||
"2802 Graf: TK wurde in Arbeitsauftrag 65487 gewandelt 48 1 \n",
|
||
"7378 Neue Gasfeder eingebaut 23 1 \n",
|
||
"\n",
|
||
" assoc_obj_ids num_assoc_obj_ids \n",
|
||
"112 [0, 1, 7, 17, 41, 42, 43, 44, 45, 46, 47, 51, ... 953 \n",
|
||
"108 [0, 1, 140, 301, 305, 313, 314, 576, 970, 1110... 28 \n",
|
||
"147 [191, 193, 195, 197, 200, 264, 287, 288, 289, ... 413 \n",
|
||
"128 [0, 1, 7, 123, 136, 137, 138, 177, 298, 304, 3... 90 \n",
|
||
"96 [1, 20, 21, 51, 52, 53, 54, 55, 56, 64, 65, 66... 271 \n",
|
||
"... ... ... \n",
|
||
"2805 [21] 1 \n",
|
||
"2804 [144] 1 \n",
|
||
"2803 [315] 1 \n",
|
||
"2802 [405] 1 \n",
|
||
"7378 [326] 1 \n",
|
||
"\n",
|
||
"[7379 rows x 5 columns]"
|
||
]
|
||
},
|
||
"execution_count": 79,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 81,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'Sichtkontrolle durchgeführt Auffälligkeiten festgestellt vom Ausführenden bitte dazu schreiben:'"
|
||
]
|
||
},
|
||
"execution_count": 81,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp1.iat[0,0]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 82,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'Sichtkontrolle durchgeführt Auffälligkeiten festgestellt vom Ausführenden bitte dazu schreiben: Nein'"
|
||
]
|
||
},
|
||
"execution_count": 82,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp1.iat[1,0]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 83,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# save/load dataframe\n",
|
||
"FILE_PATH = f'{feature}_analyse_1.fth'\n",
|
||
"if LOAD_CALC_FILES:\n",
|
||
"    temp1 = pd.read_feather(FILE_PATH)\n",
|
||
"    temp1 = temp1.set_index('index')\n",
|
||
"else:\n",
|
||
"    save_df = temp1.reset_index()\n",
|
||
"    save_df.to_feather(FILE_PATH)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Entire dataset"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 84,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# analyse all entries; the slice below is disabled\n",
|
||
"descr = temp1[['descr', 'num_occur']]\n",
|
||
"#descr = descr.iloc[50:200,:]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 85,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"#descr.iat[0,0] = 'Das ist ein Test am 24.08.2023'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 86,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"7379"
|
||
]
|
||
},
|
||
"execution_count": 86,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"len(descr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 87,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>descr</th>\n",
|
||
" <th>num_occur</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>112</th>\n",
|
||
" <td>Sichtkontrolle durchgeführt Auffälligkeiten fe...</td>\n",
|
||
" <td>98720</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>108</th>\n",
|
||
" <td>Sichtkontrolle durchgeführt Auffälligkeiten fe...</td>\n",
|
||
" <td>1450</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>147</th>\n",
|
||
" <td>Externe Prüfung wurde durchgeführt Beanstandun...</td>\n",
|
||
" <td>1082</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>128</th>\n",
|
||
" <td>Reinigung durchgeführt Auffälligkeiten festges...</td>\n",
|
||
" <td>762</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>96</th>\n",
|
||
" <td>Sichtkontrolle wie festgelegt durchgeführt Auf...</td>\n",
|
||
" <td>648</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2805</th>\n",
|
||
" <td>X Achse Süd Führungswägen Kurze Version eingebaut</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2804</th>\n",
|
||
" <td>Maschinenrahmen ausgerichtet und ausgebeult. M...</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2803</th>\n",
|
||
" <td>Bügel und Stützräder getauscht</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2802</th>\n",
|
||
" <td>Graf: TK wurde in Arbeitsauftrag 65487 gewandelt</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7378</th>\n",
|
||
" <td>Neue Gasfeder eingebaut</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>7379 rows × 2 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" descr num_occur\n",
|
||
"112 Sichtkontrolle durchgeführt Auffälligkeiten fe... 98720\n",
|
||
"108 Sichtkontrolle durchgeführt Auffälligkeiten fe... 1450\n",
|
||
"147 Externe Prüfung wurde durchgeführt Beanstandun... 1082\n",
|
||
"128 Reinigung durchgeführt Auffälligkeiten festges... 762\n",
|
||
"96 Sichtkontrolle wie festgelegt durchgeführt Auf... 648\n",
|
||
"... ... ...\n",
|
||
"2805 X Achse Süd Führungswägen Kurze Version eingebaut 1\n",
|
||
"2804 Maschinenrahmen ausgerichtet und ausgebeult. M... 1\n",
|
||
"2803 Bügel und Stützräder getauscht 1\n",
|
||
"2802 Graf: TK wurde in Arbeitsauftrag 65487 gewandelt 1\n",
|
||
"7378 Neue Gasfeder eingebaut 1\n",
|
||
"\n",
|
||
"[7379 rows x 2 columns]"
|
||
]
|
||
},
|
||
"execution_count": 87,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"descr"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 88,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"#LOAD_CALC_FILES = True\n",
|
||
"#LOAD_CALC_FILES = False\n",
|
||
"#IS_TEST = True\n",
|
||
"IS_TEST = False"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 89,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"INFO:base:Number of entries processed: 1, Percent completed: 0.01\n",
|
||
"INFO:base:Number of entries processed: 501, Percent completed: 6.79\n",
|
||
"INFO:base:Number of entries processed: 1001, Percent completed: 13.57\n",
|
||
"INFO:base:Number of entries processed: 1501, Percent completed: 20.34\n",
|
||
"INFO:base:Number of entries processed: 2001, Percent completed: 27.12\n",
|
||
"INFO:base:Number of entries processed: 2501, Percent completed: 33.89\n",
|
||
"INFO:base:Number of entries processed: 3001, Percent completed: 40.67\n",
|
||
"INFO:base:Number of entries processed: 3501, Percent completed: 47.45\n",
|
||
"INFO:base:Number of entries processed: 4001, Percent completed: 54.22\n",
|
||
"INFO:base:Number of entries processed: 4501, Percent completed: 61.00\n",
|
||
"INFO:base:Number of entries processed: 5001, Percent completed: 67.77\n",
|
||
"INFO:base:Number of entries processed: 5501, Percent completed: 74.55\n",
|
||
"INFO:base:Number of entries processed: 6001, Percent completed: 81.33\n",
|
||
"INFO:base:Number of entries processed: 6501, Percent completed: 88.10\n",
|
||
"INFO:base:Number of entries processed: 7001, Percent completed: 94.88\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# adjacency matrix\n",
|
||
"connections = dict()\n",
|
||
"unique_tokens = set()\n",
|
||
"UPDATE_STATUS = 500\n",
|
||
"length_data = len(descr)\n",
|
||
"spell_check_candidates = set()\n",
|
||
"spell_checker = SpellChecker(language='de', distance=1)\n",
|
||
"\n",
|
||
"if not LOAD_CALC_FILES or IS_TEST:\n",
|
||
"    for count, (_, row) in enumerate(descr.iterrows()):\n",
|
||
"\n",
|
||
"        text = row['descr']\n",
|
||
"        weight = row['num_occur']\n",
|
||
"\n",
|
||
"        doc = nlp(text)\n",
|
||
"\n",
|
||
"        obtain_descendant_info(\n",
|
||
"            doc=doc,\n",
|
||
"            weight=weight,\n",
|
||
"            POS_of_interest=POS_of_interest,\n",
|
||
"            TAG_of_interest=TAG_of_interest,\n",
|
||
"            connections=connections,\n",
|
||
"            unique_tokens=unique_tokens,\n",
|
||
"            spell_check_candidates=spell_check_candidates,\n",
|
||
"            spell_check_whitelist=spell_check_whitelist,\n",
|
||
"            spell_checker=spell_checker,\n",
|
||
"            corrections=corrections,\n",
|
||
"        )\n",
|
||
"\n",
|
||
"        if count % UPDATE_STATUS == 0:\n",
|
||
"            logger.info(f'Number of entries processed: {count+1}, Percent completed: {((count+1) / length_data) * 100:.2f}')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 93,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"ADJ_DF_PATH = f'./Graphanalyse/adj_mat_df_{feature}.fth'\n",
|
||
"if not IS_TEST:\n",
|
||
"    if LOAD_CALC_FILES:\n",
|
||
"        adj_mat_undir = pd.read_feather(ADJ_DF_PATH)\n",
|
||
"        adj_mat_undir = adj_mat_undir.set_index('index')\n",
|
||
"        # additional information\n",
|
||
"        connections = load_pickle('connections.pkl')\n",
|
||
"        unique_tokens = load_pickle('unique_tokens.pkl')\n",
|
||
"    else:\n",
|
||
"        adj_mat = obtain_adj_matrix(unique_tokens=unique_tokens, connections=connections)\n",
|
||
"        adj_mat_undir = make_undir_adj_matrix(adj_mat=adj_mat)\n",
|
||
"        save_df = adj_mat_undir.reset_index()\n",
|
||
"        save_df.to_feather(ADJ_DF_PATH)\n",
|
||
"        # additional information\n",
|
||
"        save_pickle(obj=connections, path='connections.pkl')\n",
|
||
"        save_pickle(obj=unique_tokens, path='unique_tokens.pkl')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 94,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>funktionsfähig</th>\n",
|
||
" <th>Zwischenbehälter</th>\n",
|
||
" <th>Ölfilter</th>\n",
|
||
" <th>Rechter</th>\n",
|
||
" <th>Kontaktproblem</th>\n",
|
||
" <th>Geschweisst</th>\n",
|
||
" <th>vorbereiten</th>\n",
|
||
" <th>Gelenkbolzen</th>\n",
|
||
" <th>Silikonfass</th>\n",
|
||
" <th>Ausbau</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Kom</th>\n",
|
||
" <th>anlernen</th>\n",
|
||
" <th>nah</th>\n",
|
||
" <th>Begutachtung</th>\n",
|
||
" <th>Betriebszeit</th>\n",
|
||
" <th>paletten</th>\n",
|
||
" <th>augetreten</th>\n",
|
||
" <th>Antriebszahnrad</th>\n",
|
||
" <th>Gewindereparaturset</th>\n",
|
||
" <th>Heizventil</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>-20C</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>-Befestihgung</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>-Einlaufwalze</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>-Entlüftungssicherung</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>-Faltbalken</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>überzogenn</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>überzoggen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>übrtprüfen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>ünerziehen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>üperprüfen</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6946 rows × 6946 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" funktionsfähig Zwischenbehälter Ölfilter Rechter \\\n",
|
||
"-20C 0 0 0 0 \n",
|
||
"-Befestihgung 0 0 0 0 \n",
|
||
"-Einlaufwalze 0 0 0 0 \n",
|
||
"-Entlüftungssicherung 0 0 0 0 \n",
|
||
"-Faltbalken 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"überzogenn 0 0 0 0 \n",
|
||
"überzoggen 0 0 0 0 \n",
|
||
"übrtprüfen 0 0 0 0 \n",
|
||
"ünerziehen 0 0 0 0 \n",
|
||
"üperprüfen 0 0 0 0 \n",
|
||
"\n",
|
||
" Kontaktproblem Geschweisst vorbereiten Gelenkbolzen \\\n",
|
||
"-20C 0 0 0 0 \n",
|
||
"-Befestihgung 0 0 0 0 \n",
|
||
"-Einlaufwalze 0 0 0 0 \n",
|
||
"-Entlüftungssicherung 0 0 0 0 \n",
|
||
"-Faltbalken 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"überzogenn 0 0 0 0 \n",
|
||
"überzoggen 0 0 0 0 \n",
|
||
"übrtprüfen 0 0 0 0 \n",
|
||
"ünerziehen 0 0 0 0 \n",
|
||
"üperprüfen 0 0 0 0 \n",
|
||
"\n",
|
||
" Silikonfass Ausbau ... Kom anlernen nah \\\n",
|
||
"-20C 0 0 ... 0 0 0 \n",
|
||
"-Befestihgung 0 0 ... 0 0 0 \n",
|
||
"-Einlaufwalze 0 0 ... 0 0 0 \n",
|
||
"-Entlüftungssicherung 0 0 ... 0 0 0 \n",
|
||
"-Faltbalken 0 0 ... 0 0 0 \n",
|
||
"... ... ... ... ... ... ... \n",
|
||
"überzogenn 0 0 ... 0 0 0 \n",
|
||
"überzoggen 0 0 ... 0 0 0 \n",
|
||
"übrtprüfen 0 0 ... 0 0 0 \n",
|
||
"ünerziehen 0 0 ... 0 0 0 \n",
|
||
"üperprüfen 0 0 ... 0 0 0 \n",
|
||
"\n",
|
||
" Begutachtung Betriebszeit paletten augetreten \\\n",
|
||
"-20C 0 0 0 0 \n",
|
||
"-Befestihgung 0 0 0 0 \n",
|
||
"-Einlaufwalze 0 0 0 0 \n",
|
||
"-Entlüftungssicherung 0 0 0 0 \n",
|
||
"-Faltbalken 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"überzogenn 0 0 0 0 \n",
|
||
"überzoggen 0 0 0 0 \n",
|
||
"übrtprüfen 0 0 0 0 \n",
|
||
"ünerziehen 0 0 0 0 \n",
|
||
"üperprüfen 0 0 0 0 \n",
|
||
"\n",
|
||
" Antriebszahnrad Gewindereparaturset Heizventil \n",
|
||
"-20C 0 0 0 \n",
|
||
"-Befestihgung 0 0 0 \n",
|
||
"-Einlaufwalze 0 0 0 \n",
|
||
"-Entlüftungssicherung 0 0 0 \n",
|
||
"-Faltbalken 0 0 0 \n",
|
||
"... ... ... ... \n",
|
||
"überzogenn 0 0 0 \n",
|
||
"überzoggen 0 0 0 \n",
|
||
"übrtprüfen 0 0 0 \n",
|
||
"ünerziehen 0 0 0 \n",
|
||
"üperprüfen 0 0 0 \n",
|
||
"\n",
|
||
"[6946 rows x 6946 columns]"
|
||
]
|
||
},
|
||
"execution_count": 94,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"adj_mat_undir.sort_index()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 95,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"arr = adj_mat_undir.to_numpy()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 96,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"24171"
|
||
]
|
||
},
|
||
"execution_count": 96,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.count_nonzero(arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 97,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"103601"
|
||
]
|
||
},
|
||
"execution_count": 97,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.max(arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Threshold: set all edge weights below `WEIGHT_THRESHOLD` to zero"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 110,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"WEIGHT_THRESHOLD = 30"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 111,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"arr = adj_mat_undir.to_numpy()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 112,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# zero out every weight below the threshold; all other weights stay unchanged\n",
"arr = np.where(arr < WEIGHT_THRESHOLD, 0, arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 113,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"138"
|
||
]
|
||
},
|
||
"execution_count": 113,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.count_nonzero(arr)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 116,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# the copy keeps the row/column labels; the thresholded values overwrite its data\n",
"thresh_adj_mat = adj_mat_undir.copy()\n",
|
||
"thresh_adj_mat.loc[:] = arr"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 117,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>funktionsfähig</th>\n",
|
||
" <th>Zwischenbehälter</th>\n",
|
||
" <th>Ölfilter</th>\n",
|
||
" <th>Rechter</th>\n",
|
||
" <th>Kontaktproblem</th>\n",
|
||
" <th>Geschweisst</th>\n",
|
||
" <th>vorbereiten</th>\n",
|
||
" <th>Gelenkbolzen</th>\n",
|
||
" <th>Silikonfass</th>\n",
|
||
" <th>Ausbau</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Kom</th>\n",
|
||
" <th>anlernen</th>\n",
|
||
" <th>nah</th>\n",
|
||
" <th>Begutachtung</th>\n",
|
||
" <th>Betriebszeit</th>\n",
|
||
" <th>paletten</th>\n",
|
||
" <th>augetreten</th>\n",
|
||
" <th>Antriebszahnrad</th>\n",
|
||
" <th>Gewindereparaturset</th>\n",
|
||
" <th>Heizventil</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>funktionsfähig</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Zwischenbehälter</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Ölfilter</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Rechter</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Kontaktproblem</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>paletten</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>augetreten</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Antriebszahnrad</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Gewindereparaturset</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Heizventil</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6946 rows × 6946 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" funktionsfähig Zwischenbehälter Ölfilter Rechter \\\n",
|
||
"funktionsfähig 0 0 0 0 \n",
|
||
"Zwischenbehälter 0 0 0 0 \n",
|
||
"Ölfilter 0 0 0 0 \n",
|
||
"Rechter 0 0 0 0 \n",
|
||
"Kontaktproblem 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"paletten 0 0 0 0 \n",
|
||
"augetreten 0 0 0 0 \n",
|
||
"Antriebszahnrad 0 0 0 0 \n",
|
||
"Gewindereparaturset 0 0 0 0 \n",
|
||
"Heizventil 0 0 0 0 \n",
|
||
"\n",
|
||
" Kontaktproblem Geschweisst vorbereiten Gelenkbolzen \\\n",
|
||
"funktionsfähig 0 0 0 0 \n",
|
||
"Zwischenbehälter 0 0 0 0 \n",
|
||
"Ölfilter 0 0 0 0 \n",
|
||
"Rechter 0 0 0 0 \n",
|
||
"Kontaktproblem 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"paletten 0 0 0 0 \n",
|
||
"augetreten 0 0 0 0 \n",
|
||
"Antriebszahnrad 0 0 0 0 \n",
|
||
"Gewindereparaturset 0 0 0 0 \n",
|
||
"Heizventil 0 0 0 0 \n",
|
||
"\n",
|
||
" Silikonfass Ausbau ... Kom anlernen nah \\\n",
|
||
"funktionsfähig 0 0 ... 0 0 0 \n",
|
||
"Zwischenbehälter 0 0 ... 0 0 0 \n",
|
||
"Ölfilter 0 0 ... 0 0 0 \n",
|
||
"Rechter 0 0 ... 0 0 0 \n",
|
||
"Kontaktproblem 0 0 ... 0 0 0 \n",
|
||
"... ... ... ... ... ... ... \n",
|
||
"paletten 0 0 ... 0 0 0 \n",
|
||
"augetreten 0 0 ... 0 0 0 \n",
|
||
"Antriebszahnrad 0 0 ... 0 0 0 \n",
|
||
"Gewindereparaturset 0 0 ... 0 0 0 \n",
|
||
"Heizventil 0 0 ... 0 0 0 \n",
|
||
"\n",
|
||
" Begutachtung Betriebszeit paletten augetreten \\\n",
|
||
"funktionsfähig 0 0 0 0 \n",
|
||
"Zwischenbehälter 0 0 0 0 \n",
|
||
"Ölfilter 0 0 0 0 \n",
|
||
"Rechter 0 0 0 0 \n",
|
||
"Kontaktproblem 0 0 0 0 \n",
|
||
"... ... ... ... ... \n",
|
||
"paletten 0 0 0 0 \n",
|
||
"augetreten 0 0 0 0 \n",
|
||
"Antriebszahnrad 0 0 0 0 \n",
|
||
"Gewindereparaturset 0 0 0 0 \n",
|
||
"Heizventil 0 0 0 0 \n",
|
||
"\n",
|
||
" Antriebszahnrad Gewindereparaturset Heizventil \n",
|
||
"funktionsfähig 0 0 0 \n",
|
||
"Zwischenbehälter 0 0 0 \n",
|
||
"Ölfilter 0 0 0 \n",
|
||
"Rechter 0 0 0 \n",
|
||
"Kontaktproblem 0 0 0 \n",
|
||
"... ... ... ... \n",
|
||
"paletten 0 0 0 \n",
|
||
"augetreten 0 0 0 \n",
|
||
"Antriebszahnrad 0 0 0 \n",
|
||
"Gewindereparaturset 0 0 0 \n",
|
||
"Heizventil 0 0 0 \n",
|
||
"\n",
|
||
"[6946 rows x 6946 columns]"
|
||
]
|
||
},
|
||
"execution_count": 117,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"thresh_adj_mat"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 118,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"ADJ_MAT_PATH_CSV = f'./Graphanalyse/adj_mat_thresh_{feature}_{WEIGHT_THRESHOLD}.csv'\n",
|
||
"thresh_adj_mat.to_csv(path_or_buf=ADJ_MAT_PATH_CSV, encoding='cp1252', sep=';')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"---\n",
|
||
"# **Addendum**\n",
|
||
"\n",
|
||
"#### **Example analysis of the entry with the most duplicates**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 64,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Number of entries with the selected description: 47689\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"crit = uni_descr[171]\n",
|
||
"filt = wo_duplicates['VorgangsBeschreibung'] == crit\n",
|
||
"temp = wo_duplicates[filt]\n",
|
||
"print(f\"Number of entries with the selected description: {len(temp)}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 65,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>VorgangsID</th>\n",
|
||
" <th>ObjektID</th>\n",
|
||
" <th>HObjektText</th>\n",
|
||
" <th>ObjektArtID</th>\n",
|
||
" <th>ObjektArtText</th>\n",
|
||
" <th>VorgangsTypID</th>\n",
|
||
" <th>VorgangsTypName</th>\n",
|
||
" <th>VorgangsDatum</th>\n",
|
||
" <th>VorgangsStatusId</th>\n",
|
||
" <th>VorgangsPrioritaet</th>\n",
|
||
" <th>VorgangsBeschreibung</th>\n",
|
||
" <th>VorgangsOrt</th>\n",
|
||
" <th>VorgangsArtText</th>\n",
|
||
" <th>ErledigungsDatum</th>\n",
|
||
" <th>ErledigungsArtText</th>\n",
|
||
" <th>ErledigungsBeschreibung</th>\n",
|
||
" <th>MPMelderArbeitsplatz</th>\n",
|
||
" <th>MPAbteilungBezeichnung</th>\n",
|
||
" <th>Arbeitsbeginn</th>\n",
|
||
" <th>ErstellungsDatum</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>288</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>187</td>\n",
|
||
" <td>246, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>289</th>\n",
|
||
" <td>152507</td>\n",
|
||
" <td>177</td>\n",
|
||
" <td>204 S SI , Webmaschine, DL 280 EMS Breite 220</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-09</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-09</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-09</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>318</th>\n",
|
||
" <td>255972</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>203 C S SI, Webmaschine, DL 280 EMS Breite 220</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-07-30</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-07-30</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-07-30</td>\n",
|
||
" <td>2022-04-28</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>319</th>\n",
|
||
" <td>255977</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>203 C S SI, Webmaschine, DL 280 EMS Breite 220</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-08-04</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-08-04</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-08-04</td>\n",
|
||
" <td>2022-04-28</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>340</th>\n",
|
||
" <td>267942</td>\n",
|
||
" <td>187</td>\n",
|
||
" <td>246, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-08-07</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-08-07</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-08-07</td>\n",
|
||
" <td>2022-08-05</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" VorgangsID ObjektID HObjektText \\\n",
|
||
"288 155717 187 246, Webmaschine Jacquard, \n",
|
||
"289 152507 177 204 S SI , Webmaschine, DL 280 EMS Breite 220 \n",
|
||
"318 255972 249 203 C S SI, Webmaschine, DL 280 EMS Breite 220 \n",
|
||
"319 255977 249 203 C S SI, Webmaschine, DL 280 EMS Breite 220 \n",
|
||
"340 267942 187 246, Webmaschine Jacquard, \n",
|
||
"\n",
|
||
" ObjektArtID ObjektArtText VorgangsTypID VorgangsTypName \\\n",
|
||
"288 6 Jacquard-Webmaschine 1 Wartung \n",
|
||
"289 3 Luft-Webmaschine 1 Wartung \n",
|
||
"318 3 Luft-Webmaschine 1 Wartung \n",
|
||
"319 3 Luft-Webmaschine 1 Wartung \n",
|
||
"340 6 Jacquard-Webmaschine 1 Wartung \n",
|
||
"\n",
|
||
" VorgangsDatum VorgangsStatusId VorgangsPrioritaet \\\n",
|
||
"288 2022-04-01 5 0 \n",
|
||
"289 2022-04-09 5 0 \n",
|
||
"318 2022-07-30 5 0 \n",
|
||
"319 2022-08-04 5 0 \n",
|
||
"340 2022-08-07 5 0 \n",
|
||
"\n",
|
||
" VorgangsBeschreibung VorgangsOrt \\\n",
|
||
"288 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"289 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"318 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"319 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"340 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"\n",
|
||
" VorgangsArtText ErledigungsDatum \\\n",
|
||
"288 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"289 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-09 \n",
|
||
"318 Tägliche Interne Wartungstätigkeiten Weberei 2022-07-30 \n",
|
||
"319 Tägliche Interne Wartungstätigkeiten Weberei 2022-08-04 \n",
|
||
"340 Tägliche Interne Wartungstätigkeiten Weberei 2022-08-07 \n",
|
||
"\n",
|
||
" ErledigungsArtText \\\n",
|
||
"288 Intern UTT - Sichtkontrolle \n",
|
||
"289 Intern UTT - Sichtkontrolle \n",
|
||
"318 Intern UTT - Sichtkontrolle \n",
|
||
"319 Intern UTT - Sichtkontrolle \n",
|
||
"340 Intern UTT - Sichtkontrolle \n",
|
||
"\n",
|
||
" ErledigungsBeschreibung MPMelderArbeitsplatz \\\n",
|
||
"288 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"289 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"318 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"319 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"340 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"\n",
|
||
" MPAbteilungBezeichnung Arbeitsbeginn ErstellungsDatum \n",
|
||
"288 NaN 2022-04-01 2022-02-17 \n",
|
||
"289 NaN 2022-04-09 2022-02-17 \n",
|
||
"318 NaN 2022-07-30 2022-04-28 \n",
|
||
"319 NaN 2022-08-04 2022-04-28 \n",
|
||
"340 NaN 2022-08-07 2022-08-05 "
|
||
]
|
||
},
|
||
"execution_count": 65,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 66,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# check which features differ between entries that share this description\n",
|
||
"analyse_columns = ['ObjektID', 'VorgangsTypID', 'VorgangsTypName']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"ObjektID"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 67,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([ 187, 177, 249, 2654, 1792, 272, 271, 270, 269, 268, 186,\n",
|
||
" 178, 179, 2317, 2318, 2473, 2559, 1244, 240, 241, 180, 220,\n",
|
||
" 221, 222, 223, 224, 961, 962, 2166, 3212, 267, 266, 181,\n",
|
||
" 182, 213, 214, 174, 175, 176, 156, 157, 158, 247, 248,\n",
|
||
" 183, 265, 278, 1793, 1794, 218, 217, 219, 215, 216, 2319,\n",
|
||
" 2320, 228, 184, 152, 153, 2165, 154, 155, 159, 167, 168,\n",
|
||
" 169, 2313, 2314, 2315, 2316, 212, 211, 160, 161, 162, 164,\n",
|
||
" 165, 166, 264, 273, 274, 277, 276, 275, 279, 280, 281,\n",
|
||
" 282, 283, 242, 243, 244, 245, 246, 225, 227, 229, 170,\n",
|
||
" 171, 172, 173, 230, 231, 3213, 3211, 3214], dtype=int64)"
|
||
]
|
||
},
|
||
"execution_count": 67,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp['ObjektID'].unique()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 68,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"filt = temp['ObjektID'] == 2318\n",
|
||
"temp_fil1 = temp[filt]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 69,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>VorgangsID</th>\n",
|
||
" <th>ObjektID</th>\n",
|
||
" <th>HObjektText</th>\n",
|
||
" <th>ObjektArtID</th>\n",
|
||
" <th>ObjektArtText</th>\n",
|
||
" <th>VorgangsTypID</th>\n",
|
||
" <th>VorgangsTypName</th>\n",
|
||
" <th>VorgangsDatum</th>\n",
|
||
" <th>VorgangsStatusId</th>\n",
|
||
" <th>VorgangsPrioritaet</th>\n",
|
||
" <th>VorgangsBeschreibung</th>\n",
|
||
" <th>VorgangsOrt</th>\n",
|
||
" <th>VorgangsArtText</th>\n",
|
||
" <th>ErledigungsDatum</th>\n",
|
||
" <th>ErledigungsArtText</th>\n",
|
||
" <th>ErledigungsBeschreibung</th>\n",
|
||
" <th>MPMelderArbeitsplatz</th>\n",
|
||
" <th>MPAbteilungBezeichnung</th>\n",
|
||
" <th>Arbeitsbeginn</th>\n",
|
||
" <th>ErstellungsDatum</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>878</th>\n",
|
||
" <td>269743</td>\n",
|
||
" <td>2318</td>\n",
|
||
" <td>A067, Webmaschine, DL 280 EMS Breite 280</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-10-31</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-10-31</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-10-31</td>\n",
|
||
" <td>2022-08-05</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6099</th>\n",
|
||
" <td>152490</td>\n",
|
||
" <td>2318</td>\n",
|
||
" <td>A067, Webmaschine, DL 280 EMS Breite 280</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-03-24</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-03-24</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-03-24</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>13905</th>\n",
|
||
" <td>152476</td>\n",
|
||
" <td>2318</td>\n",
|
||
" <td>A067, Webmaschine, DL 280 EMS Breite 280</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-03-10</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-03-10</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-03-10</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14019</th>\n",
|
||
" <td>248301</td>\n",
|
||
" <td>2318</td>\n",
|
||
" <td>A067, Webmaschine, DL 280 EMS Breite 280</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-28</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-28</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-28</td>\n",
|
||
" <td>2022-04-14</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14211</th>\n",
|
||
" <td>254914</td>\n",
|
||
" <td>2318</td>\n",
|
||
" <td>A067, Webmaschine, DL 280 EMS Breite 280</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Luft-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-05-19</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-05-19</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-05-19</td>\n",
|
||
" <td>2022-04-28</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" VorgangsID ObjektID HObjektText \\\n",
|
||
"878 269743 2318 A067, Webmaschine, DL 280 EMS Breite 280 \n",
|
||
"6099 152490 2318 A067, Webmaschine, DL 280 EMS Breite 280 \n",
|
||
"13905 152476 2318 A067, Webmaschine, DL 280 EMS Breite 280 \n",
|
||
"14019 248301 2318 A067, Webmaschine, DL 280 EMS Breite 280 \n",
|
||
"14211 254914 2318 A067, Webmaschine, DL 280 EMS Breite 280 \n",
|
||
"\n",
|
||
" ObjektArtID ObjektArtText VorgangsTypID VorgangsTypName \\\n",
|
||
"878 3 Luft-Webmaschine 1 Wartung \n",
|
||
"6099 3 Luft-Webmaschine 1 Wartung \n",
|
||
"13905 3 Luft-Webmaschine 1 Wartung \n",
|
||
"14019 3 Luft-Webmaschine 1 Wartung \n",
|
||
"14211 3 Luft-Webmaschine 1 Wartung \n",
|
||
"\n",
|
||
" VorgangsDatum VorgangsStatusId VorgangsPrioritaet \\\n",
|
||
"878 2022-10-31 5 0 \n",
|
||
"6099 2022-03-24 5 0 \n",
|
||
"13905 2022-03-10 5 0 \n",
|
||
"14019 2022-04-28 5 0 \n",
|
||
"14211 2022-05-19 5 0 \n",
|
||
"\n",
|
||
" VorgangsBeschreibung VorgangsOrt \\\n",
|
||
"878 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"6099 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"13905 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"14019 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"14211 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"\n",
|
||
" VorgangsArtText ErledigungsDatum \\\n",
|
||
"878 Tägliche Interne Wartungstätigkeiten Weberei 2022-10-31 \n",
|
||
"6099 Tägliche Interne Wartungstätigkeiten Weberei 2022-03-24 \n",
|
||
"13905 Tägliche Interne Wartungstätigkeiten Weberei 2022-03-10 \n",
|
||
"14019 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-28 \n",
|
||
"14211 Tägliche Interne Wartungstätigkeiten Weberei 2022-05-19 \n",
|
||
"\n",
|
||
" ErledigungsArtText \\\n",
|
||
"878 Intern UTT - Sichtkontrolle \n",
|
||
"6099 Intern UTT - Sichtkontrolle \n",
|
||
"13905 Intern UTT - Sichtkontrolle \n",
|
||
"14019 Intern UTT - Sichtkontrolle \n",
|
||
"14211 Intern UTT - Sichtkontrolle \n",
|
||
"\n",
|
||
" ErledigungsBeschreibung MPMelderArbeitsplatz \\\n",
|
||
"878 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"6099 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"13905 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"14019 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"14211 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"\n",
|
||
" MPAbteilungBezeichnung Arbeitsbeginn ErstellungsDatum \n",
|
||
"878 NaN 2022-10-31 2022-08-05 \n",
|
||
"6099 NaN 2022-03-24 2022-02-17 \n",
|
||
"13905 NaN 2022-03-10 2022-02-17 \n",
|
||
"14019 NaN 2022-04-28 2022-04-14 \n",
|
||
"14211 NaN 2022-05-19 2022-04-28 "
|
||
]
|
||
},
|
||
"execution_count": 69,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_fil1.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 70,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"<DatetimeArray>\n",
|
||
"['2022-10-31 00:00:00', '2022-03-24 00:00:00', '2022-03-10 00:00:00',\n",
|
||
" '2022-04-28 00:00:00', '2022-05-19 00:00:00', '2022-04-09 00:00:00',\n",
|
||
" '2022-04-21 00:00:00', '2022-06-11 00:00:00', '2022-05-12 00:00:00',\n",
|
||
" '2022-04-23 00:00:00',\n",
|
||
" ...\n",
|
||
" '2022-10-28 00:00:00', '2022-07-06 00:00:00', '2023-06-14 00:00:00',\n",
|
||
" '2022-10-29 00:00:00', '2022-07-07 00:00:00', '2023-06-15 00:00:00',\n",
|
||
" '2022-05-05 00:00:00', '2022-10-30 00:00:00', '2022-07-08 00:00:00',\n",
|
||
" '2022-10-19 00:00:00']\n",
|
||
"Length: 462, dtype: datetime64[ns]"
|
||
]
|
||
},
|
||
"execution_count": 70,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_fil1['VorgangsDatum'].unique()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 71,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"462"
|
||
]
|
||
},
|
||
"execution_count": 71,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"len(temp_fil1)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
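||
"### VorgangsID\n",
"\n",
"How many distinct ``VorgangsID`` values occur, and can one ``VorgangsID`` cover several ``ObjektID`` entries?"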
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Anzahl einzigartiger VorgangsID 1855 mit Anteil am Gesamtdatensatz 3.89 %\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# Distinct VorgangsID values and their share relative to the number of rows in temp\n",
"uni_VorgangsID = temp['VorgangsID'].unique()\n",
|
||
"num_uni_VorgangsID = len(uni_VorgangsID)\n",
|
||
"print(f'Anzahl einzigartiger VorgangsID {num_uni_VorgangsID} mit Anteil am Gesamtdatensatz {num_uni_VorgangsID / len(temp) * 100:.2f} %')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"155717"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"uni_VorgangsID[0]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 50,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Boolean mask: keep only the rows that carry the first distinct VorgangsID\n",
"filt = temp['VorgangsID'] == uni_VorgangsID[0]\n",
|
||
"temp_fil1 = temp[filt]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 51,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>VorgangsID</th>\n",
|
||
" <th>ObjektID</th>\n",
|
||
" <th>HObjektText</th>\n",
|
||
" <th>ObjektArtID</th>\n",
|
||
" <th>ObjektArtText</th>\n",
|
||
" <th>VorgangsTypID</th>\n",
|
||
" <th>VorgangsTypName</th>\n",
|
||
" <th>VorgangsDatum</th>\n",
|
||
" <th>VorgangsStatusId</th>\n",
|
||
" <th>VorgangsPrioritaet</th>\n",
|
||
" <th>VorgangsBeschreibung</th>\n",
|
||
" <th>VorgangsOrt</th>\n",
|
||
" <th>VorgangsArtText</th>\n",
|
||
" <th>ErledigungsDatum</th>\n",
|
||
" <th>ErledigungsArtText</th>\n",
|
||
" <th>ErledigungsBeschreibung</th>\n",
|
||
" <th>MPMelderArbeitsplatz</th>\n",
|
||
" <th>MPAbteilungBezeichnung</th>\n",
|
||
" <th>Arbeitsbeginn</th>\n",
|
||
" <th>ErstellungsDatum</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>288</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>187</td>\n",
|
||
" <td>246, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2718</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>1792</td>\n",
|
||
" <td>A057, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2719</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>186</td>\n",
|
||
" <td>245 J, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2720</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>2473</td>\n",
|
||
" <td>A056, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5504</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>2559</td>\n",
|
||
" <td>A070, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5505</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>961</td>\n",
|
||
" <td>A054, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5506</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>962</td>\n",
|
||
" <td>A055, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5507</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>2166</td>\n",
|
||
" <td>A061, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5508</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>1793</td>\n",
|
||
" <td>A058, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5509</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>1794</td>\n",
|
||
" <td>A059, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8294</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>2165</td>\n",
|
||
" <td>A060, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" VorgangsID ObjektID HObjektText ObjektArtID \\\n",
|
||
"288 155717 187 246, Webmaschine Jacquard, 6 \n",
|
||
"2718 155717 1792 A057, Webmaschine Jacquard, 6 \n",
|
||
"2719 155717 186 245 J, Webmaschine Jacquard, 6 \n",
|
||
"2720 155717 2473 A056, Webmaschine Jacquard, 6 \n",
|
||
"5504 155717 2559 A070, Webmaschine Jacquard, 6 \n",
|
||
"5505 155717 961 A054, Webmaschine Jacquard, 6 \n",
|
||
"5506 155717 962 A055, Webmaschine Jacquard, 6 \n",
|
||
"5507 155717 2166 A061, Webmaschine Jacquard, 6 \n",
|
||
"5508 155717 1793 A058, Webmaschine Jacquard, 6 \n",
|
||
"5509 155717 1794 A059, Webmaschine Jacquard, 6 \n",
|
||
"8294 155717 2165 A060, Webmaschine Jacquard, 6 \n",
|
||
"\n",
|
||
" ObjektArtText VorgangsTypID VorgangsTypName VorgangsDatum \\\n",
|
||
"288 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"2718 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"2719 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"2720 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5504 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5505 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5506 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5507 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5508 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5509 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"8294 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"\n",
|
||
" VorgangsStatusId VorgangsPrioritaet \\\n",
|
||
"288 5 0 \n",
|
||
"2718 5 0 \n",
|
||
"2719 5 0 \n",
|
||
"2720 5 0 \n",
|
||
"5504 5 0 \n",
|
||
"5505 5 0 \n",
|
||
"5506 5 0 \n",
|
||
"5507 5 0 \n",
|
||
"5508 5 0 \n",
|
||
"5509 5 0 \n",
|
||
"8294 5 0 \n",
|
||
"\n",
|
||
" VorgangsBeschreibung VorgangsOrt \\\n",
|
||
"288 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"2718 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"2719 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"2720 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"5504 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"5505 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"5506 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"5507 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"5508 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"5509 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"8294 Tägliche Wartungstätigkeiten nach Vorgabe des ... NaN \n",
|
||
"\n",
|
||
" VorgangsArtText ErledigungsDatum \\\n",
|
||
"288 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"2718 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"2719 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"2720 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"5504 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"5505 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"5506 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"5507 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"5508 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"5509 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"8294 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
|
||
"\n",
|
||
" ErledigungsArtText \\\n",
|
||
"288 Intern UTT - Sichtkontrolle \n",
|
||
"2718 Intern UTT - Sichtkontrolle \n",
|
||
"2719 Intern UTT - Sichtkontrolle \n",
|
||
"2720 Intern UTT - Sichtkontrolle \n",
|
||
"5504 Intern UTT - Sichtkontrolle \n",
|
||
"5505 Intern UTT - Sichtkontrolle \n",
|
||
"5506 Intern UTT - Sichtkontrolle \n",
|
||
"5507 Intern UTT - Sichtkontrolle \n",
|
||
"5508 Intern UTT - Sichtkontrolle \n",
|
||
"5509 Intern UTT - Sichtkontrolle \n",
|
||
"8294 Intern UTT - Sichtkontrolle \n",
|
||
"\n",
|
||
" ErledigungsBeschreibung MPMelderArbeitsplatz \\\n",
|
||
"288 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"2718 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"2719 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"2720 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"5504 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"5505 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"5506 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"5507 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"5508 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"5509 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"8294 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... NaN \n",
|
||
"\n",
|
||
" MPAbteilungBezeichnung Arbeitsbeginn ErstellungsDatum \n",
|
||
"288 NaN 2022-04-01 2022-02-17 \n",
|
||
"2718 NaN 2022-04-01 2022-02-17 \n",
|
||
"2719 NaN 2022-04-01 2022-02-17 \n",
|
||
"2720 NaN 2022-04-01 2022-02-17 \n",
|
||
"5504 NaN 2022-04-01 2022-02-17 \n",
|
||
"5505 NaN 2022-04-01 2022-02-17 \n",
|
||
"5506 NaN 2022-04-01 2022-02-17 \n",
|
||
"5507 NaN 2022-04-01 2022-02-17 \n",
|
||
"5508 NaN 2022-04-01 2022-02-17 \n",
|
||
"5509 NaN 2022-04-01 2022-02-17 \n",
|
||
"8294 NaN 2022-04-01 2022-02-17 "
|
||
]
|
||
},
|
||
"execution_count": 51,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_fil1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 63,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Anzahl Einträge mit gewählter VorgangsID: 11\n",
|
||
"Anzahl einzigartiger ObjektIDs darunter: 11\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
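||
"# fillna(value=False) swaps every NaN for the sentinel False, so empty columns show False in later output\n",
"temp_fil2 = temp_fil1.fillna(value=False)\n",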
|
||
"print(f'Anzahl Einträge mit gewählter VorgangsID: {len(temp_fil2)}')\n",
|
||
"uni_obj_id = len(temp_fil2['ObjektID'].unique())\n",
|
||
"print(f'Anzahl einzigartiger ObjektIDs darunter: {uni_obj_id}')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 72,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([ 187, 1792, 186, 2473, 2559, 961, 962, 2166, 1793, 1794, 2165],\n",
|
||
" dtype=int64)"
|
||
]
|
||
},
|
||
"execution_count": 72,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_fil2['ObjektID'].unique()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 55,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>VorgangsID</th>\n",
|
||
" <th>ObjektID</th>\n",
|
||
" <th>HObjektText</th>\n",
|
||
" <th>ObjektArtID</th>\n",
|
||
" <th>ObjektArtText</th>\n",
|
||
" <th>VorgangsTypID</th>\n",
|
||
" <th>VorgangsTypName</th>\n",
|
||
" <th>VorgangsDatum</th>\n",
|
||
" <th>VorgangsStatusId</th>\n",
|
||
" <th>VorgangsPrioritaet</th>\n",
|
||
" <th>VorgangsBeschreibung</th>\n",
|
||
" <th>VorgangsOrt</th>\n",
|
||
" <th>VorgangsArtText</th>\n",
|
||
" <th>ErledigungsDatum</th>\n",
|
||
" <th>ErledigungsArtText</th>\n",
|
||
" <th>ErledigungsBeschreibung</th>\n",
|
||
" <th>MPMelderArbeitsplatz</th>\n",
|
||
" <th>MPAbteilungBezeichnung</th>\n",
|
||
" <th>Arbeitsbeginn</th>\n",
|
||
" <th>ErstellungsDatum</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>288</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>187</td>\n",
|
||
" <td>246, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2718</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>1792</td>\n",
|
||
" <td>A057, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2719</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>186</td>\n",
|
||
" <td>245 J, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2720</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>2473</td>\n",
|
||
" <td>A056, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5504</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>2559</td>\n",
|
||
" <td>A070, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5505</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>961</td>\n",
|
||
" <td>A054, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5506</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>962</td>\n",
|
||
" <td>A055, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5507</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>2166</td>\n",
|
||
" <td>A061, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5508</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>1793</td>\n",
|
||
" <td>A058, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5509</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>1794</td>\n",
|
||
" <td>A059, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8294</th>\n",
|
||
" <td>155717</td>\n",
|
||
" <td>2165</td>\n",
|
||
" <td>A060, Webmaschine Jacquard,</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>Jacquard-Webmaschine</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Wartung</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Tägliche Wartungstätigkeiten nach Vorgabe des ...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>Tägliche Interne Wartungstätigkeiten Weberei</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>Intern UTT - Sichtkontrolle</td>\n",
|
||
" <td>Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten...</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2022-04-01</td>\n",
|
||
" <td>2022-02-17</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" VorgangsID ObjektID HObjektText ObjektArtID \\\n",
|
||
"288 155717 187 246, Webmaschine Jacquard, 6 \n",
|
||
"2718 155717 1792 A057, Webmaschine Jacquard, 6 \n",
|
||
"2719 155717 186 245 J, Webmaschine Jacquard, 6 \n",
|
||
"2720 155717 2473 A056, Webmaschine Jacquard, 6 \n",
|
||
"5504 155717 2559 A070, Webmaschine Jacquard, 6 \n",
|
||
"5505 155717 961 A054, Webmaschine Jacquard, 6 \n",
|
||
"5506 155717 962 A055, Webmaschine Jacquard, 6 \n",
|
||
"5507 155717 2166 A061, Webmaschine Jacquard, 6 \n",
|
||
"5508 155717 1793 A058, Webmaschine Jacquard, 6 \n",
|
||
"5509 155717 1794 A059, Webmaschine Jacquard, 6 \n",
|
||
"8294 155717 2165 A060, Webmaschine Jacquard, 6 \n",
|
||
"\n",
|
||
" ObjektArtText VorgangsTypID VorgangsTypName VorgangsDatum \\\n",
|
||
"288 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"2718 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"2719 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"2720 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5504 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5505 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5506 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5507 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
|
||
"5508 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
"5509 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
"8294 Jacquard-Webmaschine 1 Wartung 2022-04-01 \n",
"\n",
" VorgangsStatusId VorgangsPrioritaet \\\n",
"288 5 0 \n",
"2718 5 0 \n",
"2719 5 0 \n",
"2720 5 0 \n",
"5504 5 0 \n",
"5505 5 0 \n",
"5506 5 0 \n",
"5507 5 0 \n",
"5508 5 0 \n",
"5509 5 0 \n",
"8294 5 0 \n",
"\n",
" VorgangsBeschreibung VorgangsOrt \\\n",
"288 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"2718 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"2719 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"2720 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"5504 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"5505 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"5506 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"5507 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"5508 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"5509 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"8294 Tägliche Wartungstätigkeiten nach Vorgabe des ... False \n",
"\n",
" VorgangsArtText ErledigungsDatum \\\n",
"288 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"2718 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"2719 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"2720 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"5504 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"5505 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"5506 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"5507 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"5508 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"5509 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"8294 Tägliche Interne Wartungstätigkeiten Weberei 2022-04-01 \n",
"\n",
" ErledigungsArtText \\\n",
"288 Intern UTT - Sichtkontrolle \n",
"2718 Intern UTT - Sichtkontrolle \n",
"2719 Intern UTT - Sichtkontrolle \n",
"2720 Intern UTT - Sichtkontrolle \n",
"5504 Intern UTT - Sichtkontrolle \n",
"5505 Intern UTT - Sichtkontrolle \n",
"5506 Intern UTT - Sichtkontrolle \n",
"5507 Intern UTT - Sichtkontrolle \n",
"5508 Intern UTT - Sichtkontrolle \n",
"5509 Intern UTT - Sichtkontrolle \n",
"8294 Intern UTT - Sichtkontrolle \n",
"\n",
" ErledigungsBeschreibung MPMelderArbeitsplatz \\\n",
"288 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"2718 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"2719 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"2720 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"5504 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"5505 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"5506 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"5507 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"5508 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"5509 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"8294 Sichtkontrolle durchgeführt\\n\\nAuffälligkeiten... False \n",
"\n",
" MPAbteilungBezeichnung Arbeitsbeginn ErstellungsDatum \n",
"288 False 2022-04-01 2022-02-17 \n",
"2718 False 2022-04-01 2022-02-17 \n",
"2719 False 2022-04-01 2022-02-17 \n",
"2720 False 2022-04-01 2022-02-17 \n",
"5504 False 2022-04-01 2022-02-17 \n",
"5505 False 2022-04-01 2022-02-17 \n",
"5506 False 2022-04-01 2022-02-17 \n",
"5507 False 2022-04-01 2022-02-17 \n",
"5508 False 2022-04-01 2022-02-17 \n",
"5509 False 2022-04-01 2022-02-17 \n",
"8294 False 2022-04-01 2022-02-17 "
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"temp_fil2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Question: Can several ObjektIDs be assigned to a single Vorgang (operation)? If so, why do the completion dates (ErledigungsDatum) differ?*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Length of the descriptions**"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [],
"source": [
"# Series -> DataFrame so that a length column can be added\n",
"descriptions = descriptions.to_frame()\n",
"# character count per description; .str.len() replaces the deprecated DataFrame.applymap\n",
"descriptions['length_description'] = descriptions['VorgangsBeschreibung'].str.len()\n",
"# longest descriptions first\n",
"descriptions = descriptions.sort_values(by='length_description', ascending=False)"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 124008.000000\n",
"mean 70.351751\n",
"std 53.080901\n",
"min 1.000000\n",
"25% 66.000000\n",
"50% 66.000000\n",
"75% 67.000000\n",
"max 3137.000000\n",
"Name: length_description, dtype: float64"
]
},
"execution_count": 74,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# summary statistics of the description lengths (in characters)\n",
"len_descr = descriptions['length_description']\n",
"len_descr.describe()"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>VorgangsBeschreibung</th>\n",
" <th>length_description</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>8704</th>\n",
" <td>Vorgaben aus Held Wartungsplan\\n\\nLC-X-Achse /...</td>\n",
" <td>3137</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7826</th>\n",
" <td>Vorgaben aus Held Wartungsplan\\n\\nLC-X-Achse /...</td>\n",
" <td>3137</td>\n",
" </tr>\n",
" <tr>\n",
" <th>49779</th>\n",
" <td>Laut Wartungsvertrag (Hr.Radtke) Bestellnummer...</td>\n",
" <td>2311</td>\n",
" </tr>\n",
" <tr>\n",
" <th>124118</th>\n",
" <td>Laut Wartungsvertrag (Hr.Radtke) Bestellnummer...</td>\n",
" <td>2311</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14853</th>\n",
" <td>Laut Wartungsvertrag (Hr.Radtke) Bestellnummer...</td>\n",
" <td>2311</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" VorgangsBeschreibung length_description\n",
"8704 Vorgaben aus Held Wartungsplan\\n\\nLC-X-Achse /... 3137\n",
"7826 Vorgaben aus Held Wartungsplan\\n\\nLC-X-Achse /... 3137\n",
"49779 Laut Wartungsvertrag (Hr.Radtke) Bestellnummer... 2311\n",
"124118 Laut Wartungsvertrag (Hr.Radtke) Bestellnummer... 2311\n",
"14853 Laut Wartungsvertrag (Hr.Radtke) Bestellnummer... 2311"
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"descriptions.head()"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>VorgangsBeschreibung</th>\n",
" <th>length_description</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>8704</th>\n",
" <td>Vorgaben aus Held Wartungsplan\\n\\nLC-X-Achse /...</td>\n",
" <td>3137</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7826</th>\n",
" <td>Vorgaben aus Held Wartungsplan\\n\\nLC-X-Achse /...</td>\n",
" <td>3137</td>\n",
" </tr>\n",
" <tr>\n",
" <th>49779</th>\n",
" <td>Laut Wartungsvertrag (Hr.Radtke) Bestellnummer...</td>\n",
" <td>2311</td>\n",
" </tr>\n",
" <tr>\n",
" <th>124118</th>\n",
" <td>Laut Wartungsvertrag (Hr.Radtke) Bestellnummer...</td>\n",
" <td>2311</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14853</th>\n",
" <td>Laut Wartungsvertrag (Hr.Radtke) Bestellnummer...</td>\n",
" <td>2311</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13450</th>\n",
" <td></td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13451</th>\n",
" <td></td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29979</th>\n",
" <td></td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13452</th>\n",
" <td></td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21214</th>\n",
" <td>\\n</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>124008 rows × 2 columns</p>\n",
"</div>"
],
"text/plain": [
" VorgangsBeschreibung length_description\n",
"8704 Vorgaben aus Held Wartungsplan\\n\\nLC-X-Achse /... 3137\n",
"7826 Vorgaben aus Held Wartungsplan\\n\\nLC-X-Achse /... 3137\n",
"49779 Laut Wartungsvertrag (Hr.Radtke) Bestellnummer... 2311\n",
"124118 Laut Wartungsvertrag (Hr.Radtke) Bestellnummer... 2311\n",
"14853 Laut Wartungsvertrag (Hr.Radtke) Bestellnummer... 2311\n",
"... ... ...\n",
"13450 1\n",
"13451 1\n",
"29979 1\n",
"13452 1\n",
"21214 \\n 1\n",
"\n",
"[124008 rows x 2 columns]"
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"descriptions"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}