Questions tagged [sampling]

In signal processing, sampling is the reduction of a continuous signal to a discrete signal. In statistics, sampling is the selection of a subset of individuals from within a statistical population to estimate characteristics of the whole population.

93 votes
13 answers
75k views

Take n random elements from a List<E>?

How can I take n random elements from an ArrayList<E>? Ideally, I'd like to be able to make successive calls to the take() method to get another x elements, without replacement.
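A minimal sketch of the idea behind this question, written in Python rather than the Java the asker uses. The class name `Sampler` and its `take()` method are hypothetical names chosen for illustration: the pool is shuffled once, and successive calls hand out disjoint slices, so nothing is returned twice.

```python
import random


class Sampler:
    """Hands out random elements without replacement across calls (hypothetical helper)."""

    def __init__(self, items, rng=None):
        # Copy the input so the caller's list is never modified.
        self._pool = list(items)
        self._rng = rng or random.Random()
        # One up-front shuffle makes every later slice a uniform random subset.
        self._rng.shuffle(self._pool)

    def take(self, n):
        """Return up to n elements that no earlier call has returned."""
        n = min(n, len(self._pool))
        cut = len(self._pool) - n
        taken = self._pool[cut:]
        del self._pool[cut:]
        return taken
```

Calling `take(4)` three times on a ten-element pool yields 4, 4, and then the 2 remaining elements. The up-front shuffle costs O(len) once; a lazy partial Fisher-Yates shuffle would avoid that if only a few elements are ever taken.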
79 votes
2 answers
28k views

What does replacement mean in numpy.random.choice?

The documentation here explains the function numpy.random.choice. However, I am confused about the third parameter, replace. What is it, and in which cases is it useful? Thanks!
wking's user avatar
  • 1,343
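A small illustration of what `replace` controls, assuming NumPy is installed. With `replace=True` a value can be drawn more than once, so the sample may be larger than the population; with `replace=False` every draw is distinct, so asking for more values than exist raises `ValueError`. The wrapper name `draw` is hypothetical.

```python
import numpy as np


def draw(population, size, replace):
    """Draw `size` values from range(population) using np.random.choice."""
    return np.random.choice(population, size=size, replace=replace)


# With replacement: 8 draws from only 5 values is fine, repeats are allowed.
with_repl = draw(5, 8, replace=True)

# Without replacement: 5 draws from 5 values gives each value exactly once.
without_repl = draw(5, 5, replace=False)

# Without replacement you cannot draw more values than the population holds.
try:
    draw(5, 8, replace=False)
    oversample_ok = True
except ValueError:
    oversample_ok = False
```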
58 votes
8 answers
23k views

Algorithms for determining the key of an audio sample

I am interested in determining the musical key of an audio sample. How would (or could) an algorithm go about trying to approximate the key of a musical audio sample? Antares Autotune and Melodyne ...
Alex's user avatar
  • 4,884
55 votes
1 answer
3k views

Abysmal OpenCL ImageSampling performance vs OpenGL TextureSampling

I've recently ported my volumeraycaster from OpenGL to OpenCL, which decreased the raycaster's performance by about 90 percent. I tracked the performance decrease to the OpenCL's imagesampling ...
user1449137's user avatar
48 votes
5 answers
108k views

Random Sample of a subset of a dataframe in Pandas

I have a pandas DataFrame with 100,000 rows and want to split it into 100 sections with 1000 rows in each of them. How do I draw a random sample of certain size (e.g. 50 rows) of just one of the 100 ...
WGP's user avatar
  • 738
40 votes
12 answers
20k views

How to generate a random 4 digit number not starting with 0 and having unique digits?

This works almost fine but the number starts with 0 sometimes: import random numbers = random.sample(range(10), 4) print(''.join(map(str, numbers))) I've found a lot of examples but none of them ...
Menon A.'s user avatar
  • 580
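One possible way to avoid the leading zero, sketched under the assumption that every valid number should be equally likely: pick the first digit from 1 to 9, then sample the other three from the nine digits that remain. The function name `unique_four_digit` is hypothetical.

```python
import random


def unique_four_digit(rng=random):
    """Return a 4-character string of distinct digits whose first digit is not 0."""
    # The leading digit comes from 1..9, so it can never be 0.
    first = rng.choice(range(1, 10))
    # The other three digits are sampled without replacement from what is left.
    remaining = [d for d in range(10) if d != first]
    rest = rng.sample(remaining, 3)
    return "".join(map(str, [first] + rest))
```

Each of the 9 * 9 * 8 * 7 = 4536 valid numbers is produced with the same probability, and no retry loop is needed.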
40 votes
1 answer
52k views

What are chunks, samples and frames when using pyaudio

After going through the documentation of pyaudio and reading some other articles on the web, I am confused if my understanding is correct. This is the code for audio recording found on pyaudio's site:...
shiva's user avatar
  • 2,543
38 votes
8 answers
98k views

Stratified random sampling from data frame

I have a data frame in the format: head(subset) # ants 0 1 1 0 1 # age 1 2 2 1 3 # lc 1 1 0 1 0 I need to create new data frame with random samples according to age and lc. For example I want ...
user3525533's user avatar
27 votes
3 answers
866 views

FloatingPointError from PyMC in sampling from a Dirichlet distribution

After being unsuccessful in using decorators to define the stochastic object of the "logarithm of an exponential random variable", I decided to manually write the code for this new distribution using ...
Cupitor's user avatar
  • 11.3k
22 votes
1 answer
32k views

How to draw waveform of Android's music player? [closed]

one of the default live wallpapers that came with my phone was a wallpaper that displayed the wave form of music playing in the background in real time. I was wondering how one could go about doing ...
INeedHelpWithWaveforms's user avatar
22 votes
3 answers
3k views

Use R to Randomly Assign of Participants to Treatments on a Daily Basis

The Problem: I am attempting to use R to generate a random study design where half of the participants are randomly assigned to "Treatment 1" and the other half are assigned to "Treatment 2". ...
Wu Wei's user avatar
  • 380
19 votes
1 answer
13k views

sample random point in triangle [closed]

Suppose you have an arbitrary triangle with vertices A, B, and C. This paper (section 4.2) says that you can generate a random point, P, uniformly from within triangle ABC by the following convex ...
dsg's user avatar
  • 13k
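The convex combination the question refers to is cut off in the excerpt, so it is not reproduced here. As a sketch of a different, commonly used technique (the parallelogram reflection method), the following draws a point uniformly inside a 2D triangle. The function name and the tuple-based point representation are assumptions made for illustration.

```python
import random


def random_point_in_triangle(a, b, c, rng=random):
    """Return a uniformly distributed (x, y) point inside triangle a, b, c."""
    u, v = rng.random(), rng.random()
    # (u, v) is uniform on the unit square, which maps to the parallelogram
    # spanned by AB and AC. Points in the far half are reflected back.
    if u + v > 1.0:
        u, v = 1.0 - u, 1.0 - v
    x = a[0] + u * (b[0] - a[0]) + v * (c[0] - a[0])
    y = a[1] + u * (b[1] - a[1]) + v * (c[1] - a[1])
    return (x, y)
```

The reflection is an area-preserving map of one half of the parallelogram onto the other, so the result stays uniform without any rejection loop.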
17 votes
5 answers
19k views

Random Sampling from Mongo

I have a mongo collection with documents. Every document has one field whose value is 0 or 1. I need to randomly sample 1000 records from the database and count the number of documents that have that ...
Aditya Singh's user avatar
17 votes
2 answers
7k views

Is there an algorithm for weighted reservoir sampling? [closed]

Is there an algorithm for how to perform reservoir sampling when the points in the data stream have associated weights?
Budhapest's user avatar
  • 601
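One known approach is the A-Res scheme of Efraimidis and Spirakis: give each item the key `u ** (1 / weight)` with `u` uniform on [0, 1), and keep the `k` items with the largest keys. The sketch below is an illustration under that assumption, not necessarily the answer accepted on the question; the function name and the `(item, weight)` stream format are hypothetical.

```python
import heapq
import random


def weighted_reservoir(stream, k, rng=random):
    """Keep k items from an iterable of (item, weight) pairs, favouring heavy items."""
    heap = []  # min-heap of (key, arrival_index, item); smallest key at heap[0]
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # non-positive weights can never be selected
        key = rng.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, index, item))
        elif key > heap[0][0]:
            # The new key beats the weakest kept item, so swap it in.
            heapq.heapreplace(heap, (key, index, item))
    return [item for _, _, item in heap]
```

The arrival index in each tuple only breaks ties, so items themselves never need to be comparable. Memory stays at O(k) regardless of stream length.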
15 votes
3 answers
30k views

How to perform under sampling in scikit learn?

We have a retinal dataset wherein the diseased eye information constitutes 70 percent of the information whereas the non-diseased eye constitutes the remaining 30 percent. We want a dataset wherein the ...
Gaurav Patil's user avatar
15 votes
6 answers
8k views

How to keep a random subset of a stream of data?

I have a stream of events flowing through my servers. It is not feasible for me to store all of them, but I would like to periodically be able to process some of them in aggregate. So, I want to ...
twk's user avatar
  • 17.1k
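The classic technique for this situation is reservoir sampling (often called Algorithm R). A minimal sketch, assuming the events arrive as any Python iterable and that a fixed-size uniform sample is what is wanted; the function name is hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=random):
    """Return k items chosen uniformly from a stream of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number i+1 replaces a kept item with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir
```

Only `k` items are held in memory at any time, and if the stream has fewer than `k` items, all of them are returned.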
15 votes
3 answers
30k views

How to do a random stratified sampling with Python (Not a train/test split)?

I am looking for the best way to do random stratified sampling, as in surveys and polls. I don't want to use sklearn.model_selection.StratifiedShuffleSplit since I am not doing a supervised learning ...
asl's user avatar
  • 471
14 votes
4 answers
3k views

Random sampling to give an exact sum

I want to sample 140 numbers between 1000 and 100000 such that the sum of these 140 numbers is around 2 million (2000000): sample(1000:100000,140) such that: sum(sample(1000:100000,140)) = 2000000 ...
Hardik Gupta's user avatar
  • 4,750
14 votes
6 answers
16k views

Select cells randomly from NumPy array - without replacement

I'm writing some modelling routines in NumPy that need to select cells randomly from a NumPy array and do some processing on them. All cells must be selected without replacement (as in, once a cell ...
robintw's user avatar
  • 28.1k
14 votes
1 answer
6k views

Efficiently picking a random element from a chained hash table?

Just for practice (and not as a homework assignment) I have been trying to solve this problem (CLRS, 3rd edition, exercise 11.2-6): Suppose we have stored n keys in a hash table of size m, with ...
Bicheng.Cao's user avatar
14 votes
8 answers
20k views

OpenCV, how to use arrays of points for smoothing and sampling contours?

I'm having trouble getting my head around smoothing and sampling contours in OpenCV (C++ API). Let's say I have a sequence of points retrieved from cv::findContours (for instance applied on this ...
Quentin Geissmann's user avatar
13 votes
4 answers
30k views

Stratified splitting of pandas dataframe into training, validation and test set

The following extremely simplified DataFrame represents a much larger DataFrame containing medical diagnoses: medicalData = pd.DataFrame({'diagnosis':['positive','positive','negative','negative','...
Oblomov's user avatar
  • 9,263
13 votes
1 answer
40k views

Taking a disproportionate sample from a dataset in R

If I have a large dataset in R, how can I take a random sample of the data taking into consideration the distribution of the original data, particularly if the data are skewed and only 1% belong to a ...
simplyme's user avatar
  • 221
13 votes
4 answers
5k views

Profiling a (possibly I/O-bound) process to reduce latency

I want to improve the performance of a specific method inside a larger application. The goal is improving latency (wall-clock time spent in a specific function), not (necessarily) system load. ...
Arnout Engelen's user avatar
12 votes
4 answers
7k views

Oversampling functionality in Tensorflow dataset API

I would like to ask whether the current datasets API allows for the implementation of an oversampling algorithm. I deal with a highly imbalanced class problem. I was thinking that it would be nice to oversample ...
K Kolasinski's user avatar
11 votes
2 answers
8k views

Profilers Instrumenting Vs Sampling

I am doing a study comparing profilers, mainly instrumenting and sampling. I have come up with the following info: sampling: stop the execution of the program, take the PC and thus deduce where the program is ...
Syntax_Error's user avatar
  • 6,092
10 votes
5 answers
44k views

Latin hypercube sampling with python

I would like to sample a distribution defined by a function in multiple dimensions (2,3,4): f(x, y, ...) = ... The distributions might be ugly, non standard (like a 3D spline on data, sum of ...
user2393987's user avatar
10 votes
2 answers
16k views

Android: startRecording() called on an uninitialized AudioRecord when SAMPLERATE set to 44100

I get an error, when I set the sampling rate to 44100 for the AudioRecord object. When it's 22050 it works fine. 02-16 10:45:45.099 24021-24021/com.vlad.jackcomms E/AudioRecord﹕ frameCount 1024 < ...
user3333414's user avatar
10 votes
1 answer
17k views

How to sample on condition with pandas?

I have a dataframe df like the following: Col1 Col2 0 1 T 1 1 B 2 3 S 3 2 A 4 1 C 5 2 A etc... I would like to create two dataframes: ...
zzzbbx's user avatar
  • 10.9k
9 votes
2 answers
30k views

Audio samples per second?

I am wondering about the relationship between a block of samples and its time equivalent. Given my rough idea so far: Number of samples played per second = total filesize / duration. So say, I have a 1....
user488792's user avatar
  • 1,973
9 votes
2 answers
4k views

Audio sample frequency rely on channels?

If you have audio encoded at 44100Hz that means you have 44100 samples per second. Does this mean 44100 samples/sec for a channel, or for all channels? For example if a song is stereo and encoded at ...
goocreations's user avatar
  • 2,976
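The sample rate counts samples per channel, so a 44100 Hz stereo stream carries 44100 samples per second in each of its two channels. The arithmetic can be sketched as below for uncompressed PCM; the 16-bit sample width in the usage note is an assumption (the question does not state it), and the function names are hypothetical.

```python
def pcm_bytes_per_second(sample_rate_hz, channels, bits_per_sample):
    """Raw data rate of uncompressed PCM audio, in bytes per second."""
    bytes_per_sample = bits_per_sample // 8
    # One sample per channel is taken at every tick of the sample clock.
    return sample_rate_hz * channels * bytes_per_sample


def pcm_duration_seconds(num_bytes, sample_rate_hz, channels, bits_per_sample):
    """Playback time of a block of raw PCM data (headers not included)."""
    return num_bytes / pcm_bytes_per_second(sample_rate_hz, channels, bits_per_sample)
```

For example, `pcm_bytes_per_second(44100, 2, 16)` gives 176400 bytes per second, twice the mono figure. This does not apply to compressed formats such as MP3, where file size is not proportional to sample count.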
9 votes
3 answers
3k views

Sampling from MultiIndex DataFrame

I'm working with the following panel data in a MultiIndex pandas DataFrame called df_data: y x n time 0 0 0.423607 -0.307983 1 0.565563 -0....
J Jung's user avatar
  • 93
9 votes
2 answers
7k views

How to equidistant resample a line (or curve)?

I have a line l_1 given with a point series p_1,...,p_n. I now want a new line l_2 having k points: q_1,...,q_k. But for all i \in {1,...,k-1}: abs( q_i - q_i+1 ) = const, meaning the segments of l_2 ...
math's user avatar
  • 8,652
9 votes
4 answers
1k views

How to select points at a regular density

how do I select a subset of points at a regular density? More formally, Given a set A of irregularly spaced points, a metric of distance dist (e.g., Euclidean distance), and a target density d, how ...
h2kyeong's user avatar
  • 447
9 votes
2 answers
8k views

Android MediaRecorder Sampling Rate and Noise

I have an issue using Android's MediaRecorder to record sound from microphone to .m4a files (AAC-LC, MPEG-4 container). Starting from API level 18, the default sampling rate drops from 44.1 or 48 kHz ...
9 votes
1 answer
12k views

librosa.load() takes too long to load(sample) mp3 files

I am trying to sample (convert analog to digital) mp3 files via the following Python code using the librosa library, but it takes too much time (around 4 seconds for one file). I suspect this is ...
john doe's user avatar
  • 437
9 votes
3 answers
3k views

what is the difference between sampled_softmax_loss and nce_loss in tensorflow?

I notice there are two functions for negative sampling in TensorFlow to compute the loss (sampled_softmax_loss and nce_loss). The parameters of these two functions are similar, but I really want to ...
王乐义's user avatar
8 votes
2 answers
1k views

Why does random sampling scale with the dataset not the sample size? (pandas .sample() example)

When sampling randomly from distributions of varying sizes I was surprised to observe that execution time seems to scale mostly with the size of the dataset being sampled from, not the number of ...
c_layton's user avatar
8 votes
1 answer
3k views

Pandas: Sampling from a DataFrame according to a target distribution

I have a Pandas DataFrame containing a dataset D of instances drawn from a distribution x. x may be uniform, for example. Now, I want to draw n samples from D, sampled according to some new ...
meow's user avatar
  • 975
8 votes
2 answers
2k views

How do I Sample each group from a pandas data frame at different rates

I have a data frame containing information about a population that I wish to generate a sample from. I also have a dataframe sample_info that details how many units of each group in the population ...
Ryan's user avatar
  • 142
8 votes
2 answers
2k views

How to sample/partition panel data by individuals( preferably with caret library)?

I would like to partition panel data and preserve the panel nature of the data: library(caret) library(mlbench) #example panel data where id is the persons identifier over years ...
Googme's user avatar
  • 914
8 votes
0 answers
888 views

Sampling from a joint distribution in Pyro

I understand how to sample from multidimensional categorical, or multivariate normal (with dependence within each column). For example, for a multivariate categorical, this can be done as below: ...
alpaca's user avatar
  • 1,231
7 votes
3 answers
2k views

Stratified sampling on factor

I have a dataset of 1000 rows with the following structure: device geslacht leeftijd type1 type2 1 mob 0 53 C 3 2 tab 1 64 G 7 3 pc ...
karmabob's user avatar
  • 105
7 votes
1 answer
12k views

"incorrect number of probabilities" error using sample()

I was trying sample(); however, whenever I use custom probabilities in it, it displays "incorrect number of probabilities". I've tried pretty much everything but am still stuck. Kindly guide me ...
blackhawk's user avatar
7 votes
2 answers
17k views

Why set.seed() affects sample() in R

I always thought set.seed() only makes random variable generators (e.g., rnorm) generate a unique sequence for any specific set of input values. However, I'm wondering, why when we set the set.seed(...
rnorouzian's user avatar
  • 7,457
7 votes
7 answers
634 views

name of algorithm related to load balancing / re-distribution

Given an array [x1, x2, x3, ..., xk ] where xi is the number of items in box i, how can I redistribute the items so that no box contains more than N items. N is close to sum(xi)/k -- That is, N is ...
Gus's user avatar
  • 4,415
7 votes
2 answers
9k views

Sampling from a given probability distribution using R

Given the probability distribution as follows: x-coordinate represents hours, y-coordinate means the probability for each hour. The problem is how to generate a set of 1000 random data that follows ...
Gamp's user avatar
  • 319
7 votes
1 answer
13k views

How to draw N random samples from a vector in R?

I have a vector with 663 elements. I would like to create random samples from the vector equal to the length of the vector (i.e. 663). Said differently, I would like to take random samples from all ...
RTrain3k's user avatar
  • 857
7 votes
1 answer
2k views

How to repeat 1000 times this random walk simulation in R?

I'm simulating a one-dimensional and symmetric random walk procedure: y[t] = y[t-1] + epsilon[t] where white noise is denoted by epsilon[t] ~ N(0,1) in time period t. There is no drift in this ...
Übel Yildmar's user avatar
7 votes
5 answers
5k views

Randomly sampling unique subsets of an array

If I have an array: a = [1,2,3] How do I randomly select subsets of the array, such that the elements of each subset are unique? That is, for a the possible subsets would be: [] [1] [2] [3] [1,2] [...
meagerf's user avatar
  • 71
