Accumulating Thoughts

On average, we have about 6,000 thoughts every day. What if we could record, track and classify all those thoughts, and feed them into a giant neural network to make a virtual version of yourself?

Writing, speaking, drawing, tweets, vlogs and all other forms of “content” we create are conscious to some extent. Our thoughts, however, are a constant stream, like a fire hose, mixing the conscious with the serendipitous.

Could this virtual self be more objective and less susceptible to emotions? Or would it also have your biases built into it? Could you have this virtual self answer calls for you, or reply to messages?

Could we use this set of thoughts to spawn other forms of virtual interaction, perhaps chatbots? Imagine a customer support chatbot modelled after Gordon Ramsay. Hilarious.

Dunning-Kruger Effect

The Dunning-Kruger Effect is a bias where people think that they are more capable than they actually are. A person with low ability is more likely to overestimate themselves. This is due to a poor grasp of the task at hand and a lack of previous experience, both of which lead them to miscalibrate their judgement. Someone who has tried to shoot a basket 1,000 times is more likely to judge their own skill accurately than someone who has tried 10 times.

Neural networks exhibit somewhat similar behaviour during learning. When a neural network is trained on a dataset, it optimizes its own parameters to produce a best guess for the problem at hand. However, if there is not enough data to learn from, the network will often fail to generalize. It has not seen enough examples to form an “average” picture of the problem, so it misreads real-world data.

Everyone is susceptible to this.

In a study conducted on a group of students, those who scored lower overestimated their performance, while the more competent ones underestimated theirs. This is common when we set out to learn something new. Initially, with little knowledge, we believe that we know a lot more than we actually do. But as we progress and see the bigger picture, how hard or how vast the topic really is, we tend to doubt ourselves. This dip is popularly called the valley of despair.

This curve is similar to the hype cycle. Even on a large scale, especially in the realm of new technology adoption, large groups of people tend to follow the same steps.

Sparsely Activated Network

A deep learning network contains many learning nodes arranged in layers and interconnected within and across those layers. Typically, deep learning models are trained by activating all nodes for each training input. Another way is to activate the network sparsely for each input, as the Switch Transformer architecture does. Only a subset of the nodes is active for a given input, and the subset varies depending on the input. As a result, the computational cost per input stays roughly constant even as the total size of the network grows. The key feature of sparse activation is that it enables different parts of the network to specialize in different kinds of inputs and problems, much like the brain, where different regions are responsible for different cognitive functions. However, this also brings new challenges such as load balancing: inputs have to be spread across the network so that some parts are not over-trained while others are barely trained at all.
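The routing idea can be sketched in a few lines. This is a toy illustration of top-1 routing under my own assumptions, not the actual Switch Transformer implementation: the class name SparseLayer, the sizes and the random weights are all made up for the example.

```python
import math
import random


def softmax(logits):
    """Turn raw router scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(v - highest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SparseLayer:
    """Hypothetical layer: a router picks ONE expert per input (top-1)."""

    def __init__(self, num_experts, dim, seed=0):
        rng = random.Random(seed)
        # One row of router weights per expert.
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each expert is just a dim x dim matrix in this sketch.
        self.experts = [[[rng.gauss(0, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]
        # How often each expert was used, to make load imbalance visible.
        self.calls = [0] * num_experts

    def forward(self, x):
        probs = softmax(matvec(self.router, x))
        chosen = max(range(len(probs)), key=probs.__getitem__)
        self.calls[chosen] += 1
        # Only the chosen expert does any work, so the cost per input
        # does not grow with the number of experts.
        output = matvec(self.experts[chosen], x)
        # Scale by the router's confidence in its choice.
        return chosen, [probs[chosen] * v for v in output]


layer = SparseLayer(num_experts=8, dim=4)
rng = random.Random(42)
for _ in range(100):
    layer.forward([rng.gauss(0, 1) for _ in range(4)])
print(layer.calls)  # uneven counts hint at the load balancing problem
```

Adding more experts makes the layer bigger without making a single forward pass more expensive, and the usage counts show why some balancing mechanism is needed.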

Google has trained language models with 1.6 trillion parameters using this technique. The nearest model in this area was GPT-3, with 175 billion parameters. This gives an idea of the leverage these models have.

Freudian Slip

A Freudian slip is an error that occurs due to an internal train of thought. It is said that this happens due to the interference of the unconscious. It is characterized as a slip because it gives away what you are thinking behind the scenes. Sometimes these slips can be quite revealing to you and to others: a peek into what you believe, what you wish for and what you haven’t addressed yet. At other times, these slips can be a way for your mind to tell you it’s standing on something. In this view, verbal errors are not random; they are puzzles to be solved.

Suppose a text engine like GPT-3 is trained on a lot of text concerning one person, maybe all the speeches of a famous personality, and the engine then writes a speech posing as them. If some part of the speech, or even a single sentence, didn’t make sense, it would easily be discarded as an outlier. But what if that is also an indication of a Freudian slip? Can this trait be replicated in systems like these?

Context and AI

AI and ML techniques have made big leaps this year, from AlphaFold tackling the protein folding problem to GPT-3, which can handle many text-related problems using its language model. Computer vision techniques have improved as well. All this research has some driver behind it, usually a research question, a grant or a niche. But AI research on understanding real-life context is hard to come by, as no one benefits from such a model right away. The Google Assistant comes close. It tries to look at multiple sources of data in different forms, like previous searches, calendar appointments and email, to make a better decision and provide a better reply. These kinds of models require something that can account for the great randomness that is humans. The same set of calendar appointments and search inputs could still have a different meaning and level of importance for different people. Judging this correctly based on second-order inputs like usage patterns and keystrokes might be the next big step.

Protein Folding

A protein exists as a polypeptide chain consisting of amino acids linked by peptide bonds. These chains are assembled by ribosomes, which are found in all living cells and perform protein synthesis. A freshly made chain has no fixed shape. What makes it biologically functional is the shape it folds into during and after synthesis. The exact folding of a protein is crucial for its function, although parts of a functional protein can still remain unfolded. From a physical point of view, the unfolded state has the highest conformational entropy, so folding has to make up for the entropy the chain loses. It does this largely through hydrophobic interactions. Molecular chaperones are a class of proteins that aid in the correct folding of other proteins. Neurodegenerative diseases have been linked to the misfolding of proteins.

The process of protein folding is not truly random. There are many variables built into it that make it really hard to predict the correct functional folding for a protein. It is estimated that a natural protein can have 10^300 possible conformations. A model that can replicate this process is still being developed. Understanding this fundamental biological process can help in developing new medicines and managing new environments.
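To get a feel for the size of that estimate (read here as 10 to the power 300), here is a back-of-the-envelope calculation. The sampling rate is purely an assumption of mine for illustration, as is the rounded age of the universe; only the number of conformations comes from the estimate above.

```python
# Number of possible conformations, from the estimate in the text.
CONFORMATIONS = 10 ** 300

# ASSUMPTION: a chain that could try ten trillion shapes per second.
SAMPLES_PER_SECOND = 10 ** 13

SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# ASSUMPTION: age of the universe rounded to 13.8 billion years.
AGE_OF_UNIVERSE_YEARS = 138 * 10 ** 8

# Python integers have arbitrary precision, so this is exact arithmetic.
seconds_needed = CONFORMATIONS // SAMPLES_PER_SECOND
years_needed = seconds_needed // SECONDS_PER_YEAR
universe_lifetimes = years_needed // AGE_OF_UNIVERSE_YEARS

# Print only the order of magnitude; the full number is hundreds of digits.
print("seconds needed: about 10^%d" % (len(str(seconds_needed)) - 1))
print("universe lifetimes: about 10^%d" % (len(str(universe_lifetimes)) - 1))
```

Even with that generous sampling rate, trying every shape one by one would take unimaginably longer than the universe has existed, yet real proteins fold quickly. That gap is why the process cannot be a blind random search.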

Federated Learning

Federated Learning is a machine learning technique that trains an algorithm on separate local samples without exchanging them. This enables training across multiple devices, which is a huge plus from a data privacy and data security point of view. The basic principle is that the algorithm is trained on the locally available data, and the resulting model parameters are then exchanged with other instances. The “other instances” can be either centralized or decentralized. Determining data characteristics from just the parameters is much harder than reading the raw data. Splitting the data into smaller local sets also counteracts bias that may be seen only in some data sets. Smartphones use this form of learning: a central model is retrieved from the cloud, the local data produced by the smartphone (for example, usage statistics and keystrokes) is used to update the model, and the updated model is then sent back to the cloud over secure channels. This shields the raw user data from the external cloud infrastructure.
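The retrieve, update and send-back loop can be sketched roughly as follows. This is a minimal toy in the style of federated averaging, which is my choice of aggregation rule and not something prescribed above. The model is a single weight w in y = w * x, and the function names and the client data are made up for the example.

```python
def local_update(weight, data, learning_rate=0.05, epochs=20):
    """Train on one device's own data; the raw data never leaves here."""
    w = weight
    for _ in range(epochs):
        # Gradient of the mean squared error of y = w * x.
        gradient = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * gradient
    return w


def federated_round(global_weight, client_datasets):
    """One round: every client trains locally, the server averages."""
    updates = [(local_update(global_weight, data), len(data))
               for data in client_datasets]
    total_samples = sum(count for _, count in updates)
    # Only weights and sample counts reach the server, weighted so that
    # clients with more data have more say.
    return sum(w * count for w, count in updates) / total_samples


# Hypothetical clients whose private data all follow y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]

w = 0.0  # the central model everyone starts from
for _ in range(10):
    w = federated_round(w, clients)
print(round(w, 3))
```

The server only ever sees the trained weight and a sample count from each client, which is the whole point. A real system would add things like secure aggregation and client sampling on top.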

A major field where this can be used is the digital health vertical, with data ranging from what is harvested from consumer wearables to data from hospitals and insurers. It fits the criterion of having very dispersed data, and it can still comply with legislation like GDPR.