Short Overview:

SOUNDVISION (Max Schweder)

First Presentation

type of project: fellow research project

published: 2020

by: Max Schweder

Core Programming and Mentoring: Chikashi Miyama



license(s): MIT


cued by: humans / machines

dev-blog: blog


Visualization of sound is always interpretation, translation, transformation from one sense to another. Can we generalize this translation? Can we shorten the necessary steps that have to be taken? Or is the absolute subjective, artistic interpretation the key factor to make visuals tangible?

As a musician, performer, and multimedia artist, Max Schweder works for the international, inclusive, and interdisciplinary performing arts company Un-Label. Un-Label pushes for a more diverse, mixed-abled cast on stage and challenges itself to make its performances as accessible as possible, using artistic creativity and technology to achieve these goals. With his electro duo CYLVESTER, Max Schweder plays concerts and parties all over Germany. CYLVESTER developed its very own visualization system, CYLvision, in collaboration with Dr. Chikashi Miyama. This first project was funded by the City of Cologne.

music-video FRANK by CYLVESTER (2018), showcasing CYLvision:

As a researcher at the Academy for Theater and Digitality in Dortmund (Germany), Max Schweder is developing SOUNDVISION, the open-source successor of CYLvision.

SOUNDVISION is being developed as an open-source artistic toolset, based on Unity and PureData (Pd), for reactive real-time visualizations of sound, music, and movement, with the goal of making performances more visually perceivable, for example for a deaf audience.

Many core elements of the code, such as the SharedMemory object for Pd that is necessary for sending analysis data from Pd to Unity, have been developed by collaborator and mentor Dr. Chikashi Miyama.

In this research it is important not to use visuals as a mere translational system. The main interest, which is increasingly becoming Max Schweder's expertise, lies in working on music and its visuals simultaneously, so that each art form reciprocally informs and inspires the other. Decisions and actions echo back and forth during the creative process.

Reactively visualizing all dimensions of a sound is a complex task. Not only common or merely bipolar parameters such as dynamics, pitch, or articulation, but also the more complex relations and connections between sounds can be visualized. Additionally, a sound-reactive virtual embodiment of a performer, provided for example by 3D camera input, can further amplify the connection between performer and sound.

  • Computer running Windows 10

Recommended minimum Sensor SDK configuration for Windows: seventh-generation Intel® Core™ i3 processor (dual-core 2.4 GHz with HD620 GPU or faster), 4 GB memory, a dedicated USB3 port, and graphics driver support for OpenGL 4.4 or DirectX 11.0.

  • Azure Kinect
  • Multi-Channel Audio Interface
  • MIDI Interface
  • Windows 10
  • Unity (refer to repo for version)
  • PureData (refer to repo for version)
  • Azure Kinect SDK (refer to repo for version)


For installation and setup, follow the instructions in the repository.

How many channels can be analyzed simultaneously? Currently, 16 audio channels are constantly analyzed in parallel.

What type of data do you get from the channels?

  1. Loudness (float)
  2. Pitch (float)
  3. Loudness of manually chosen parts of the frequency spectrum (float)
  4. Noisiness (float)
  5. Index of the closest match from a list of trained sounds, using timbreID (integer)
  6. Spectrum (array)
  7. Spectrogram (texture)
  8. Waveform (texture)
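The per-channel analysis values listed above can be thought of as one record per channel per frame. A minimal sketch in Python (field names and defaults are illustrative, not taken from the actual Pd patch or Unity scripts):

```python
from dataclasses import dataclass, field

@dataclass
class ChannelAnalysis:
    """One frame of analysis data for a single audio channel (illustrative)."""
    loudness: float = 0.0        # overall level
    pitch: float = 0.0           # estimated pitch
    band_loudness: list = field(default_factory=list)  # chosen spectrum bands
    noisiness: float = 0.0       # noisiness measure
    timbre_match: int = -1       # index of closest trained sound (timbreID)
    spectrum: list = field(default_factory=list)       # magnitude spectrum

# 16 channels are analyzed simultaneously
channels = [ChannelAnalysis() for _ in range(16)]
channels[0].loudness = 0.8
```

In the real system these values travel from Pd to Unity via the SharedMemory object rather than living in Python; the record here only illustrates the shape of the data.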

What type of data do you get from the AzureKinect?

  1. RGB image (2D texture)
  2. Depth image (3D texture)
  3. Frame difference between two frames (2D texture)
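The frame difference is simply the per-pixel change between two consecutive frames. A minimal grayscale sketch in pure Python (illustrative; the real implementation operates on GPU textures in Unity):

```python
def frame_difference(prev, curr):
    """Absolute per-pixel difference between two equally sized grayscale frames."""
    return [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]

prev = [[0, 10], [20, 30]]
curr = [[5, 10], [15, 40]]
diff = frame_difference(prev, curr)  # [[5, 0], [5, 10]]
```

Large values in the difference image mark where the performer moved between frames, which is what makes it useful as a movement-reactive input.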

SOUNDVISION can be cued manually by clicking the Next/Previous/Reset buttons in the Pd patch, or it can be cued by external MIDI input. Refer to the tutorial videos for more information. The cue list takes the form of a .csv file that is read by a C# script in Unity. Different states of the visualization are achieved by utilizing Unity's Timeline feature: the playhead of the timeline is moved to different markers in the timeline, following the cue list.

SOUNDVISION also includes a script that follows the method CYLVESTER uses to cue its shows, utilizing MIDI Program Changes coming from the CYLVESTER instrument setup.
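Cueing via MIDI Program Changes amounts to mapping an incoming program number onto a cue. A minimal sketch (illustrative only; the actual mapping used by the CYLVESTER setup is defined in the repository):

```python
def on_program_change(program, cuelist):
    """Map an incoming MIDI Program Change (0-127) to a cue, or None if out of range."""
    if 0 <= program < len(cuelist):
        return cuelist[program]
    return None

cuelist = ["intro", "verse", "chorus"]
on_program_change(1, cuelist)  # -> "verse"
```

This lets the instrument setup itself drive the visuals, with no separate operator pressing Next/Previous during the show.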

  • projects/soundvision/start.txt
  • Last modified: 18.11.2020 16:33
  • by Roman Senkl