From mo at globalsaassol.com Sat Oct 8 04:57:50 2022
From: mo at globalsaassol.com (Mike Oliver)
Date: Sat, 8 Oct 2022 08:57:50 +0000
Subject: [scikit-learn] is Sci_kiet-Learn the right choice for my project

Dear Sirs,

I am evaluating scikit-learn for a new project. I am hoping to find an AI/machine learning package that can take a large dataset of objects that have various object types and attributes. These objects are typically related to other objects, such as a server to a Wi-Fi device, or two network routers to each other. When these objects are set up, data is gathered about where they are located, what their settings are, the device type, etc.

In large organizations there can be thousands of these objects and tens of thousands of relationships, descriptions, settings, etc. My hope is that with machine learning we can detect when an object is missing, configured in error, or duplicated.

The question is, will scikit-learn help with this problem? I understand that we will have to train it to identify what to look for and then act on what was found and predicted to be the solution algorithm. Or instructions.

Thanks for your help,

Great looking product; I already have the tutorial up and running and have installed it in my Django platform.

Mike

From helmrp at yahoo.com Sat Oct 8 07:34:18 2022
From: helmrp at yahoo.com (Bob Helmbold)
Date: Sat, 08 Oct 2022 04:34:18 -0700
Subject: [scikit-learn] is Sci_kiet-Learn the right choice for my project

Maybe TensorFlow would be a better fit for your needs.

From jbbrown at kuhp.kyoto-u.ac.jp Sat Oct 8 07:35:30 2022
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Sat, 8 Oct 2022 13:35:30 +0200
Subject: [scikit-learn] is Sci_kiet-Learn the right choice for my project

Dear Mike,

Just my two cents about your inquiry, speaking strictly as a user of scikit-learn for many years.

- From your description of the application context, I would say that scikit-learn is perfectly fine. However, I would suggest being aware that a monolithic model incorporating all data (the image TV wrongly projects) is not a valid strategy. Stratifying data into contextually correct subgroups and then running scikit-learn, for example to estimate the extent of predictability during development, will be helpful.

- Duplicate checking should be easy to do using standard Python objects (set or list counting), once the context determines how the objects are vectorized/featurized. I don't see a need to force scikit-learn for that context.

- Missing-data checks could be implemented by context-specific object classes that you design, which could contain something like a __bool__() method that tells you whether the object has all of the required data populated and configured.

- Detection of errors in configuration could be either explicitly driven by logic (of the context; again, something that returns a bool indicating that an object is configured correctly), or potentially could be statistically derived as outliers from the given background data distribution, in which case scikit-learn could be of help. If there are too many variates (thousands or tens of thousands) in your data, which prohibits explicit logic, then scikit-learn's Random Forest algorithms might be perfectly fine and provide verification through visualization of Decision Tree rules.

Hope this helps,
J.B. Brown
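The duplicate and completeness checks suggested in J.B.'s reply above can be sketched without scikit-learn. This is only a minimal illustration under stated assumptions: the `Device` class, its field names, and the sample records are hypothetical and do not come from the thread.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Device:
    """Hypothetical inventory record; the field names are illustrative only."""

    name: str
    device_type: Optional[str] = None
    location: Optional[str] = None

    def __bool__(self) -> bool:
        # True only when every required attribute is populated.
        required = (self.name, self.device_type, self.location)
        return all(value not in (None, "") for value in required)


def find_duplicates(devices):
    """Return records that occur more than once (frozen dataclasses are hashable)."""
    counts = Counter(devices)
    return [device for device, n in counts.items() if n > 1]


inventory = [
    Device("router-1", "router", "HQ"),
    Device("ap-7", "wifi", None),        # incomplete: no location
    Device("router-1", "router", "HQ"),  # exact duplicate of the first record
]

incomplete = [d for d in inventory if not d]
duplicates = find_duplicates(inventory)
print(incomplete)
print(duplicates)
```

Exact-match counting only catches records that are identical after featurization; near-duplicates would need a context-specific notion of equality, which is the point at which a statistical approach may start to pay off.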
_______________________________________________
scikit-learn mailing list
scikit-learn at python.org
https://mail.python.org/mailman/listinfo/scikit-learn

From bross_phobrain at sonic.net Sat Oct 8 12:46:14 2022
From: bross_phobrain at sonic.net (Bill Ross)
Date: Sat, 08 Oct 2022 09:46:14 -0700
Subject: [scikit-learn] is Sci_kiet-Learn the right choice for my project

> My hope is that with machine learning we can detect when an object is missing, or configured in error, or duplicates.

These look like simple correctness issues that I'd address with programming. Why do you want to use a learned approach? Do you think it will be faster to develop, or have a faster runtime?

Bill

--
Phobrain.com

From mo at globalsaassol.com Sun Oct 9 18:43:24 2022
From: mo at globalsaassol.com (Mike Oliver)
Date: Sun, 9 Oct 2022 22:43:24 +0000
Subject: [scikit-learn] is Sci_kiet-Learn the right choice for my project

Bill,

Granted, you are correct: the examples I gave are simple enough for a set of rules to process. The problem as I see it is that with the plethora of potential relationships, the rules may not be adequate. Inconsistent data is not transactional. For example, I can store data on a pair of objects and they pass all the rules. Then later another relationship is stored, and that passes all the rules, yet just because A is consistent with B and B is consistent with C does NOT mean A is consistent with C.

Anomaly detection can be simple or it can be complex. But if 10,000 relationships have been stored and one changes, I want an algorithm to emerge that we can see and turn into an action to fix it. We are also hoping that things may emerge that we did not anticipate. That may require deep learning sub-symbolic neural networks, but that is yet to be determined.

And yes, with literally thousands of records flowing through the system, processing each record against every other record in a read-before-write model is not going to perform well. If we can use machine learning, we can take that evaluation, and even the corrections, out of that processing flow.

Thanks. If scikit-learn is not a good fit for my goals, let me know. If you know a better fit, please let me know as well.

Mike

From thomasjpfan at gmail.com Wed Oct 12 15:15:35 2022
From: thomasjpfan at gmail.com (Thomas J. Fan)
Date: Wed, 12 Oct 2022 15:15:35 -0400
Subject: [scikit-learn] scikit-learn monthly developer meeting: October 31, 2022

Dear all,

The scikit-learn developer monthly meeting will take place on October 31, 2022 at 15:00 UTC.

- Video call link: https://meet.google.com/gmn-acub-mrr
- Meeting notes / agenda: https://hackmd.io/0yokz72CTZSny8y3Re648Q
- Local times: https://www.timeanddate.com/worldclock/meetingdetails.html?year=2022&month=10&day=31&hour=15&min=0&sec=0&p1=1440&p2=240&p3=248&p4=195&p5=179&p6=224

The goal of this meeting is to discuss ongoing development topics for the project. Everybody is welcome.

As usual, please follow the code of conduct of the project: https://github.com/scikit-learn/scikit-learn/blob/main/CODE_OF_CONDUCT.md

Regards,
Thomas

From xxy1836 at gmail.com Sun Oct 16 23:54:20 2022
From: xxy1836 at gmail.com (Riko Naka)
Date: Mon, 17 Oct 2022 11:54:20 +0800
Subject: [scikit-learn] Decision function for kernel method SVM not reproducible.

Hi guys,

The decision function calculation for the kernel-method SVM has been bothering me for a long time. I calculated the decision function manually, but the result differs from what the function provided in sklearn returns. This is my issue on GitHub; did I calculate anything wrong?

https://github.com/scikit-learn/scikit-learn/issues/24663
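For readers following the SVM question above, here is a minimal sketch of how the decision function of a binary RBF-kernel `SVC` can be recomputed from its fitted attributes, f(x) = sum_i dual_coef_i * K(sv_i, x) + intercept. The synthetic data, the explicit `gamma = 0.5`, and the binary setting are assumptions made for illustration; none of it is taken from the linked issue, and it is not known whether it explains the discrepancy reported there.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Synthetic binary problem (illustrative only).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Use an explicit float gamma so the manual kernel uses exactly the same value.
gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

# Kernel between every sample and every support vector: shape (n_samples, n_SV).
K = rbf_kernel(X, clf.support_vectors_, gamma=gamma)

# In the binary case dual_coef_ has shape (1, n_SV) and intercept_ has shape (1,).
manual = K @ clf.dual_coef_.ravel() + clf.intercept_[0]

max_abs_diff = np.max(np.abs(manual - clf.decision_function(X)))
print(max_abs_diff)
```

One thing worth checking in a manual calculation is the kernel coefficient: with the default `gamma="scale"`, the value used is `1 / (n_features * X.var())`, so a hand-written kernel with a different gamma will not match. The multiclass case combines pairwise one-vs-one classifiers and needs a more involved computation than this sketch.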
From reshama.stat at gmail.com Thu Oct 27 15:39:58 2022
From: reshama.stat at gmail.com (Reshama Shaikh)
Date: Thu, 27 Oct 2022 15:39:58 -0400
Subject: [scikit-learn] Pandas DataFrame output is now available for all sklearn transformers

Hello,

Pandas DataFrame output is now available for all sklearn transformers (in dev version 1.2)!

This will make running pipelines on data frames much easier, and provides better ways to track feature names. There is a 14-minute video with examples, some more information, and some FAQs answered at the end [a].

This is one of the biggest improvements in scikit-learn in a long time and we'd love your feedback! Please try out the nightly build and give it a go. We'd love to hear both about whether this helps your use cases and about any bugs you find.

A special thanks to the maintainers: Thomas J. Fan, Guillaume Lemaitre, Christian Lorentzen!

[a] video https://youtu.be/J4KCu9WWDTo
[b] example https://scikit-learn.org/dev/auto_examples/miscellaneous/plot_set_output.html#sphx-glr-auto-examples-miscellaneous-plot-set-output-py
[c] LinkedIn post https://www.linkedin.com/feed/update/urn:li:activity:6987027021608460289/?actorCompanyId=79865351

---
Reshama Shaikh
she/her

From seralouk at hotmail.com Fri Oct 28 05:18:05 2022
From: seralouk at hotmail.com (Serafeim Loukas)
Date: Fri, 28 Oct 2022 09:18:05 +0000
Subject: [scikit-learn] Discrepancy in "Feature importances with a forest of trees" documentation
Message-ID: <696AE2E8-8414-458E-ACF4-3E8ADB94E4D0@hotmail.com>

Dear Scikit-learn community,

I have been reading some examples in https://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_importances.html#feature-importance-based-on-mean-decrease-in-impurity about the permutation importance that can be assessed after fitting a tree-based model (e.g. RandomForestClassifier).

However, I have noticed a discrepancy that I would like to mention. If a one-hot-encoding step is used before model fitting, the `.feature_importances_` attribute includes importances for all the levels of the transformed categorical features (e.g. for gender, we get 2 importances, for males and females respectively). When I apply the `permutation_importance` function, though, the outputs correspond to the non-transformed data.

To illustrate this, I include a toy example in .py format.

Best,
Makis

[Attachment scrubbed by the archive: Untitled.py, text/x-python-script, 2410 bytes]

From g.lemaitre58 at gmail.com Sat Oct 29 11:32:47 2022
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Sat, 29 Oct 2022 17:32:47 +0200
Subject: [scikit-learn] [ANN] scikit-learn 1.1.3 is online!

scikit-learn 1.1.3 is out on pypi.org and conda-forge!

This bugfix release only includes fixes for compatibility with the latest SciPy release >= 1.9.2 and wheels for Python 3.11. Note that support for 32-bit Python on Windows has been dropped in this release, because SciPy 1.9.2 also dropped support for that platform. Windows users are advised to install the 64-bit version of Python instead.

More details in the changelog: https://scikit-learn.org/dev/whats_new/v1.1.html#version-1-1-3

You can upgrade with pip as usual:

    pip install -U scikit-learn

The conda-forge builds will be available shortly, which you can then install using:

    conda install -c conda-forge scikit-learn

Thanks again to all the contributors.

On behalf of the scikit-learn maintainer team.

--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/
From etfredeluces at up.edu.ph Sun Oct 30 07:18:22 2022
From: etfredeluces at up.edu.ph (Ellarizza Fredeluces)
Date: Sun, 30 Oct 2022 19:18:22 +0800
Subject: [scikit-learn] Inquiry on Genetic Algorithm

Dear Scikit-Learn developers,

First of all, thank you for your brilliant work. I would like to ask if a genetic algorithm is available in scikit-learn. I tried to search, but I only found this one. I also checked your website, but there seems to be no genetic algorithm yet.

Your reply will be highly appreciated. Thank you again.

Sincerely,
Ella

From olivier.grisel at ensta.org Sun Oct 30 07:18:32 2022
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Sun, 30 Oct 2022 12:18:32 +0100
Subject: [scikit-learn] [ANN] scikit-learn 1.1.3 is online!

Thank you so much Guillaume for getting this release out, and to Chiara for pushing forward with the Python 3.11 wheel building infrastructure update and related fixes!

--
Olivier

From tevang3 at gmail.com Sun Oct 30 08:18:58 2022
From: tevang3 at gmail.com (Thomas Evangelidis)
Date: Sun, 30 Oct 2022 13:18:58 +0100
Subject: [scikit-learn] Inquiry on Genetic Algorithm

Hi,

I am not aware of any *official* scikit-learn implementation of a genetic algorithm. I program my own with DEAP, which is quite versatile:

https://deap.readthedocs.io/en/master/

~Thomas

--
======================================================================
Dr. Thomas Evangelidis
Research Scientist
IOCB - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
&
CEITEC - Central European Institute of Technology, Brno, Czech Republic
email: tevang3 at gmail.com, Twitter: tevangelidis, LinkedIn: Thomas Evangelidis
website: https://sites.google.com/site/thomasevangelidishomepage/

From matthieu.brucher at gmail.com Sun Oct 30 08:21:21 2022
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sun, 30 Oct 2022 12:21:21 +0000
Subject: [scikit-learn] Inquiry on Genetic Algorithm

GAs are not a machine learning model; they are a way of minimizing a cost function, so there are probably modules dedicated to this elsewhere.

Matthieu

--
Quantitative researcher, Ph.D.
Blog: http://blog.audio-tk.com/
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From trev.stephens at gmail.com Sun Oct 30 08:30:53 2022
From: trev.stephens at gmail.com (Trevor Stephens)
Date: Sun, 30 Oct 2022 23:30:53 +1100
Subject: [scikit-learn] Inquiry on Genetic Algorithm

Not GA, but gplearn is an estimator that wraps genetic programming in a (mostly) scikit-learn compatible API. TPOT is another package that might be what you're looking for. More details on what you are doing would help point you in the right direction.

From t3kcit at gmail.com Mon Oct 31 20:33:58 2022
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 01 Nov 2022 00:33:58 +0000
Subject: [scikit-learn] VOTE SLEP 17

Hey Everybody!

SLEP 17 (by Joel Nothman) introduces an __sklearn_clone__ protocol & method that allows estimators to override what sklearn's clone function does. An implementation for this SLEP is available in PR 24568 (by Thomas Fan).

Feedback on both the SLEP and the implementation is welcome!
If you are a scikit-learn core developer, please cast your vote in this PR.

According to our governance model, the vote will be open for a month (till December 1st), and the motion is accepted if 2/3 of the votes cast are in favor.

Cheers,
Andreas